Ergodic Classical-Quantum Channels: Structure and Coding Theorems

Igor Bjelakovic and Holger Boche, Member, IEEE



Abstract — We consider ergodic causal classical-quantum chan- 
nels (cq-channels) which additionally have a decaying input 
memory. In the first part we develop some structural properties 
of ergodic cq-channels and provide equivalent conditions for 
ergodicity. In the second part we prove the coding theorem with 
weak converse for causal ergodic cq-channels with decaying input 
memory. Our proof is based on the possibility to introduce a 
joint input-output state for the cq-channels and an application 
of the Shannon-McMillan theorem for ergodic quantum states. 
In the last part of the paper it is shown how this result implies 
a coding theorem for the classical capacity of a class of causal 
ergodic quantum channels. 

Index Terms — Ergodic quantum channels, coding theorems, 
ergodicity, classical-quantum channels

I. Introduction 

One of the main achievements in quantum information 
theory is the determination of the capacity of memoryless 
quantum channels by Holevo [16], [17], and independently 
by Schumacher and Westmoreland [31], for transmission of 
classical information. These results have been considerably 
sharpened by Winter [38], who extended Wolfowitz's [39] 
approach via frequency typical sequences to the quantum 
setting and obtained a coding theorem with strong converse
for transmission of classical information over memoryless
quantum channels. At the same time Ogawa and Nagaoka [27]
proved the strong converse in the memoryless situation by a
different proof that follows Arimoto's classic approach [2].
Subsequently Shor [33] and Devetak [10] have shown by
independent proofs that the quantum capacity of the memoryless
quantum channel is given by the regularized coherent information. The weak
converse to this coding theorem was already established by 
Barnum, Nielsen and Schumacher in [3]. Shor uses the method 
of random selection of subspaces while Devetak chooses an 
approach via private classical capacity and a transformation 
of private classical codes into quantum codes. An interesting 
point in Devetak's approach is that the classical capacity 
results for quantum channels [17], [31], [38] are among the
crucial building blocks for the direct part of the coding theorem for
quantum channels. In spite of this progress one of the main
open questions concerning classical capacity of memoryless 
quantum channels, namely the additivity problem [34], is still 
unresolved. 
Although memoryless quantum channels play a prominent role
in the development of the foundations of quantum information 
theory, real-world channels are rarely memoryless. Thus it is 
desirable to have a broad and efficient theory for quantum 
channels with memory. In this paper we will consider causal 
ergodic classical-quantum channels satisfying an additional 
continuity condition, which basically means that the effect 
of the inputs far in the past decays with respect to the 
variational distance. The causality means in this context that 
the outputs up to time t depend only on inputs up to time 
t. The coding results obtained are then applied to a class of
causal quantum channels in order to obtain their classical
capacity. The extension to causal ergodic quantum channels
with decaying input memory is far from obvious, since several
ergodicity problems arise. The discussion of these problems
is postponed to future work.

A. Overview and Outline 

In section II we introduce general classical-quantum channels
(cq-channels for short) in a fashion analogous to the classical
setting as a family of conditional states and show how
this point of view leads to the usual definition as a completely
positive map between the output algebra and the input algebra.
Concepts of stationarity and ergodicity are also introduced
and an equivalent condition for ergodicity of a stationary
cq-channel is derived. It states that a stationary cq-channel is
ergodic iff it is an extreme point in the set of all stationary
cq-channels. Several equivalent formulations of this statement are
given and it is shown that not all of them can be extended to
stationary quantum channels simultaneously.
The second part of section II is devoted to the notion of
continuity of cq-channels. As was pointed out by McMillan
in his classic paper [26], the continuity properties of channels
are crucial for the validity of coding theorems. We formulate
these continuity notions with respect to the variational distance
on quantum states. A more natural notion of distance
would be a metric on quantum states which shares as many
properties as possible with the classical $\bar{d}$-distance, which
is extremely sensitive to the ergodic and mixing properties of
channels and probability measures. See e.g. Gray and Ornstein
[14] for an application to classical channels, and the books [32], [15]
by Shields resp. Gray for an introduction and applications to
ergodic and information theory. To our knowledge, there is
still no metric on the set of quantum states which could play
a role similar to that of the $\bar{d}$-distance.
Section II closes with some examples.
After some preliminary results in section III we prove the
direct coding theorem and the weak converse to it in section
IV. Our approach combines the maximal code construction
of Wolfowitz [39], which was already applied by Winter
[38] for memoryless cq-channels, with the quantum Shannon-McMillan
theorem from [4], in order to avoid the use of
frequency typical and conditionally frequency typical projections,
which are not an adequate tool for correlated quantum
states. In a sense, this approach is a mixture of Wolfowitz's
code construction and the version of Feinstein's lemma from
Blackwell/Breiman/Thomasian in [6] which is based on the
notion of the joint input-output probability distribution.
These results are extended to causal cq-channels with decaying
input memory in section V. Since our notion of continuity is
symmetric with respect to past and future it is even possible
to prove the coding theorem for cq-channels with decaying input
memory and anticipation without any additional complications,
although we are only interested in causal situations.
Finally, the results are applied to obtain coding results for
transmission of classical information via weakly output mixing
quantum channels in section VI. In order to keep the extent
of this paper reasonable, the extension to ergodic quantum
channels is postponed to a forthcoming paper.
For a reader without experience with C*-algebras, states and
quasi-local algebras we provide an appendix that contains a 
short description of these concepts complemented by some 
standard references and some examples. 

B. Related Work 

In [20] Kretschmann and Werner considered stationary 
causal quantum channels, and have shown that each channel 
of this type can be seen as a concatenation of channels acting 
on the single-site input algebra and memory algebra with the 
range in the joint system of the single-site output algebra and 
memory algebra. In the classical setting this approach corresponds
to the point of view that the past inputs and outputs
can be seen as the states of the channel for the transmission of 
the actual input letter and the resulting output symbol. If the 
duration of the memory of the input and output is finite this 
class of channels is called finite-state channels and coding 
theorems for them were established by Blackwell, Breiman 
and Thomasian in [5] under the assumption that the channel 
is additionally indecomposable. A different account to finite- 
state indecomposable channels is given in the monograph [12] 
by Gallager. 

Kretschmann and Werner have proved in [20] a general weak 
converse to the coding theorem, and for channels with finite 
memory they derived the direct part of the coding theorem 
under the assumption that the channel is forgetful, which is the
quantum analogue of the notion of indecomposability.
Under these circumstances the channels under consideration 
are well approximable by memoryless quantum channels. 
Recently, Datta and Dorlas [8] used an approach via a quantum 
version of Feinstein's Lemma (cf. [11]) to give another proof 
of the direct part of the coding theorem for transmission 
of classical information over a quantum memoryless channel 
and remarked that this approach can be extended to prove 
the direct coding theorem for classical information transmission
via quantum channels with Markovian correlated noise (see
Section II-C for the definition of this class of channels). This
class of channels was introduced by Macchiavello and Palma
in [23] and bounds for the capacity were already obtained by
Bowen and Mancini in [7].

C. Notation 

In this paper we will write $\mathcal{F}(Y)$ for the set of $\mathbb{C}$-valued
functions defined on a finite set $Y$. The set of linear operators
(linear maps) acting on a finite-dimensional Hilbert space $\mathcal{H}$
will be denoted by $\mathcal{L}(\mathcal{H})$ and this set will often be abbreviated
by $\mathcal{B}$. $[n, k]$ for integers $n, k$ with $n \le k$ stands for integer
intervals, i.e. the set of all $z \in \mathbb{Z}$ that satisfy $n \le z \le k$.
For a given probability measure $p$ on a measurable space
$(\Omega, \Sigma)$ we will not distinguish between the measure and
the expectation functional generated by it, i.e. for an integrable
$\mathbb{C}$-valued function $f$ on $\Omega$ we set

$$p(f) := \int f(\omega)\, p(d\omega).$$

Notation concerning C*-algebras, quasi-local algebras $\mathcal{B}^{\mathbb{Z}}$
built up from $\mathcal{B}$, and states is introduced in the appendix.
Moreover, the appendix contains the definitions of stationary
and ergodic states on quasi-local algebras and of the von Neumann
entropy rate of stationary states.

For a finite set $A$ or a finite-dimensional C*-algebra $\mathcal{B}$, $A^n$
and $\mathcal{B}^n$, $n \in \mathbb{N}$, stand for $A^{[1,n]}$ resp. $\mathcal{B}^{[1,n]}$. The restriction of a
state $\psi$ on $\mathcal{B}^{\mathbb{Z}}$ to $\mathcal{B}^n$ is denoted by $\psi^n$; tr denotes the trace of
operators.

II. Classical-Quantum Channels 

Let $A$ be a finite set and let $\mathcal{H}$ be a $d$-dimensional Hilbert
space. By $A^{\mathbb{Z}}$ we denote the set of doubly infinite sequences
with components from $A$, and $\mathcal{B}^{\mathbb{Z}}$ is the quasi-local C*-algebra
with $\mathcal{B} := \mathcal{L}(\mathcal{H})$ (cf. appendix VII-B). Moreover, let $\mathcal{S}(\mathcal{B}^{\mathbb{Z}})$
denote the set of states on $\mathcal{B}^{\mathbb{Z}}$, i.e. the set of positive linear
normalized functionals on $\mathcal{B}^{\mathbb{Z}}$ with values in $\mathbb{C}$. Since $A^{\mathbb{Z}}$ can
be easily equipped with a metric making it a compact space,
we have a natural notion of Borel $\sigma$-field which coincides
with the $\sigma$-field generated by the cylinder sets, denoted by $\Sigma_c$.
Moreover all cylinder sets are open as well as closed with
respect to this topology. One possibility to introduce such a
metric on $A^{\mathbb{Z}}$ is

$$d(x,y) := \sum_{i \in \mathbb{Z}} 2^{-|i|}\, d_H(x_i, y_i) \qquad (x, y \in A^{\mathbb{Z}}),$$

where $d_H$ denotes the familiar Hamming distance. The properties
of the resulting topology mentioned above are then standard.
The set of bounded Borel measurable complex-valued
functions will be denoted by $B(A^{\mathbb{Z}}, \Sigma_c)$ and it is always
assumed that this set is endowed with the $\|\cdot\|_\infty$-norm. Note
that $B(A^{\mathbb{Z}}, \Sigma_c)$ is a commutative C*-algebra when equipped
with $\|\cdot\|_\infty$ and complex conjugation as adjoint operation (cf.
appendix VII-A for the definition).
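As an illustration only, the following Python sketch evaluates a finite partial sum of the metric $d$ for the binary alphabet. Sequences are modelled as functions from the integers to the alphabet; the names `truncated_distance`, `alternating` and `flipped_outside` are hypothetical helpers introduced here and do not appear in the paper. It shows that two sequences agreeing on the window $[-m, m]$ are at distance at most $2^{1-m}$, which is why cylinder sets generate the topology.

```python
from fractions import Fraction


def hamming(a, b):
    """Single-letter Hamming distance: 0 if the letters agree, 1 otherwise."""
    return 0 if a == b else 1


def truncated_distance(x, y, window):
    """Partial sum of d(x, y) over the coordinates -window, ..., window.

    x and y are functions from the integers to the alphabet.
    """
    return sum(Fraction(1, 2 ** abs(i)) * hamming(x(i), y(i))
               for i in range(-window, window + 1))


def alternating(i):
    """The periodic sequence (..., 0, 1, 0, 1, ...)."""
    return i % 2


def flipped_outside(m):
    """Agrees with `alternating` on [-m, m] and differs at every other site."""
    return lambda i: alternating(i) if abs(i) <= m else 1 - alternating(i)


for m in (1, 3, 8):
    dist = truncated_distance(alternating, flipped_outside(m), window=40)
    # Worst case for sequences agreeing on [-m, m]: the full tail sum 2**(1 - m).
    print(m, float(dist), dist <= Fraction(2) / 2 ** m)
```

The truncation at 40 sites is an arbitrary choice; the neglected tail is bounded by $2^{-39}$.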

We consider a classical-quantum channel (cq-channel) $W$ with
the input $A^{\mathbb{Z}}$ and output $\mathcal{B}^{\mathbb{Z}}$, i.e. a map $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$
with the following properties:
1) For each $b \in \mathcal{B}^{\mathbb{Z}}$ the map $x \mapsto W(x,b)$ is Borel
measurable.

2) For each $x \in A^{\mathbb{Z}}$ the map $b \mapsto W(x,b)$ is a state.
Remark 2.1: The fact that the map $b \mapsto W(x,b)$ is a state

for all $x \in A^{\mathbb{Z}}$ implies that $|W(x,b)| \le \|b\|$ holds and hence
that the first item above can be sharpened to the statement that
$x \mapsto W(x,b)$ is a bounded Borel function.
Note that each cq-channel $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ defines a
linear (completely) positive unital map $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$

given by $(K(b))(x) := W(x,b)$. Conversely, each linear map
$K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ with the properties mentioned above gives
rise to a cq-channel via $(K(b))(x) =: W(x,b)$. Consequently
we have a one-to-one affine correspondence between linear
positive unital maps and cq-channels. Hence the definition of
the channel as a completely positive unital map between
C*-algebras is recovered in this situation. Moreover, if the input
or output algebra is abelian, mere positivity is sufficient,
since in this situation each positive linear map is automatically
completely positive (cf. [36], [28]). We summarize this trivial
observation to ease later referencing in

Lemma 2.2: There is a one-to-one affine correspondence
between cq-channels $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ and linear positive
unital maps $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ which is given by

$$(K(b))(x) = W(x,b). \tag{1}$$
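The joint input-output state on $B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}}$ is defined by
linear extension of

$$\psi_{p,W}(1_C \otimes b) := \int_C W(x,b)\, p(dx) \qquad (C \in \Sigma_c,\ b \in \mathcal{B}^{\mathbb{Z}}), \tag{2}$$

for a given input probability measure $p$ on $(A^{\mathbb{Z}}, \Sigma_c)$. Note that
the integral above is well defined due to the assumed measurability
of the map $A^{\mathbb{Z}} \ni x \mapsto W(x,b)$ for each $b \in \mathcal{B}^{\mathbb{Z}}$. It is clear
that $\psi_{p,W}(1_C \otimes 1) = p(C)$, and now we compute

$$\begin{aligned}
\psi_{p,W}(1 \otimes b) &= \int W(x,b)\, p(dx) \\
&= \int (K(b))(x)\, p(dx) \qquad \text{(by Lemma 2.2)} \\
&= p(K(b)),
\end{aligned}$$

which is the correct formula for the output state.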
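The two marginal identities can be checked on a finite toy analogue. The Python sketch below is an assumption-laden simplification, not the setting of the paper: the input space is a two-letter alphabet instead of $A^{\mathbb{Z}}$, the output observables are diagonal $2 \times 2$ matrices (stored as lists), and all numbers are made up for illustration.

```python
from fractions import Fraction as F

# Hypothetical toy data: two input letters and a two-level output system whose
# observables are diagonal, so that a state is just a probability vector.
inputs = ["0", "1"]
p = {"0": F(1, 3), "1": F(2, 3)}            # input distribution
W_state = {"0": [F(9, 10), F(1, 10)],       # output state for input "0"
           "1": [F(1, 4), F(3, 4)]}         # output state for input "1"


def W(x, b):
    """W(x, b): expectation of the diagonal observable b in the state W(x, .)."""
    return sum(w * bj for w, bj in zip(W_state[x], b))


def K(b):
    """The map K of Lemma 2.2: b is sent to the function x -> W(x, b)."""
    return {x: W(x, b) for x in inputs}


def psi(f, b):
    """Joint input-output state on an elementary tensor f (x) b."""
    return sum(p[x] * f[x] * W(x, b) for x in inputs)


one_in = {x: F(1) for x in inputs}          # constant function 1 on the inputs
one_out = [F(1), F(1)]                      # identity of the output algebra
indicator_C = {"0": F(1), "1": F(0)}        # indicator of C = {"0"}
b = [F(1), F(0)]                            # projection onto the first level

input_marginal = psi(indicator_C, one_out)  # equals p(C)
output_marginal = psi(one_in, b)            # equals p(K(b))
p_of_Kb = sum(p[x] * K(b)[x] for x in inputs)
print(input_marginal, output_marginal, p_of_Kb)
```

Exact rational arithmetic is used so that the identities hold without rounding error.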

The formula in eq. (2) can be naturally extended to bounded
Borel functions instead of indicator functions by setting

$$\psi_{p,W}(f \otimes b) = \int f(x)\, W(x,b)\, p(dx) \tag{3}$$

for $f \in B(A^{\mathbb{Z}}, \Sigma_c)$, $b \in \mathcal{B}^{\mathbb{Z}}$. We establish that (3) defines a
state after the following

Remark 2.3: Note that in general for two C*-algebras $\mathcal{A}$,
$\mathcal{B}$ there are several norms on $\mathcal{A} \otimes \mathcal{B}$ making it a C*-algebra.
However, if one of the factors is nuclear (as are abelian,
finite-dimensional and quasi-local C*-algebras) there is a unique
C*-norm on $\mathcal{A} \otimes \mathcal{B}$ (see [19], [28] for details).

Lemma 2.4: The functional defined by linear extension of
$\psi_{p,W}$ in eq. (3) is a state on $B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}}$.

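Proof: For a given cq-channel $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ let
us consider the corresponding positive, unital linear map
$K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ from Lemma 2.2. Let us define the copier
$\mathrm{Copy} : A^{\mathbb{Z}} \to A^{\mathbb{Z}} \times A^{\mathbb{Z}}$, $\mathrm{Copy}(x) := (x, x)$, and the induced
map $\mathrm{Copy} : B(A^{\mathbb{Z}}, \Sigma_c) \otimes B(A^{\mathbb{Z}}, \Sigma_c) \to B(A^{\mathbb{Z}}, \Sigma_c)$ which is
given by $\mathrm{Copy}(f) := f \circ \mathrm{Copy}$ (here we identify $B(A^{\mathbb{Z}}, \Sigma_c) \otimes B(A^{\mathbb{Z}}, \Sigma_c)$
with $B(A^{\mathbb{Z}} \times A^{\mathbb{Z}}, \Sigma_c \times \Sigma_c)$). Note that this last
map is linear, positive and unital, and thus completely positive
and unital (see the discussion preceding Lemma 2.2). Moreover
the map $\mathrm{id}_{B(A^{\mathbb{Z}}, \Sigma_c)} \otimes K : B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c) \otimes B(A^{\mathbb{Z}}, \Sigma_c)$
is completely positive and unital (see [28] chap.
12). Let us define $E : B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ by

$$E := \mathrm{Copy} \circ (\mathrm{id}_{B(A^{\mathbb{Z}}, \Sigma_c)} \otimes K). \tag{4}$$

As a composition of completely positive unital maps, $E$ is itself
completely positive and unital. Thus

$$(p \circ E)(a) = \int (E(a))(x)\, p(dx) \qquad (a \in B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}})$$

defines a state on $B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}}$, and it is apparent that this
state coincides with $\psi_{p,W}$ on elementary tensors $f \otimes b$ and
(finite) linear combinations thereof. Consequently the two
coincide everywhere. ■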
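Note that eq. (4) sets up a one-to-one affine correspondence
between linear, positive unital maps $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ and
linear positive unital maps $E : B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$
with

$$E(f \otimes b) = f\,E(1 \otimes b) = f\,K(b), \tag{5}$$

as is easily verified. Thus combining this simple observation
and Lemma 2.2 we obtain

Theorem 2.5: There is a one-to-one affine correspondence
between

1) cq-channels $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$,

2) linear, positive unital maps $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ and

3) linear, positive unital maps $E : B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ with

$$E(f \otimes b) = f\,E(1 \otimes b).$$

The correspondences are given by eq. (1) and eq. (4).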
We emphasize at this point that these results depend heavily on
the fact that the input algebra $B(A^{\mathbb{Z}}, \Sigma_c)$ is abelian (commutative).
This can be seen in eq. (4) where we used the universal
copier Copy, a device which is provably impossible to introduce
in a truly quantum mechanical setting (see [40], [24],
[37] for various versions of this fact, known as the No-Cloning
theorem). That the affine correspondence in Theorem 2.5
cannot be valid for general channels (i.e. completely positive
unital maps $K : \mathcal{B} \to \mathcal{A}$ between general C*-algebras) can
be seen from the following theorem, called Heisenberg's principle
in [24], which states that for each completely positive unital
map which would give rise to a joint input-output state the
corresponding channel ($K(b) = E(1 \otimes b)$ in our case) has
necessarily commutative range:

Theorem 2.6 (Heisenberg's Principle [24]): Let $\mathcal{A}, \mathcal{B}$ be
arbitrary unital C*-algebras and let $E : \mathcal{A} \otimes \mathcal{B} \to \mathcal{A}$ be
a completely positive unital map with

$$E(a \otimes 1_{\mathcal{B}}) = a \qquad \forall a \in \mathcal{A}.$$

Then

$$E(1_{\mathcal{A}} \otimes b) \in Z(\mathcal{A}) \qquad \forall b \in \mathcal{B},$$

where $Z(\mathcal{A})$ denotes the center of $\mathcal{A}$ given by

$$Z(\mathcal{A}) := \{a' \in \mathcal{A} : a'a = aa' \ \forall a \in \mathcal{A}\},$$

which is an abelian *-subalgebra of $\mathcal{A}$.
The proof of this theorem in [24] is given for finite-dimensional
algebras, but it can be immediately extended to
arbitrary C*-algebras, since all necessary ingredients are
valid generally (cf. [28]). Thus, in the general situation we have
merely a completely positive unital map $K : \mathcal{B} \to \mathcal{A}$ as the
description of the quantum channel at our disposal.
A cq-channel $W$ is called stationary if $W(T_{in}x, b) = W(x, T_{out}b)$
holds for all $x \in A^{\mathbb{Z}}$ and all $b \in \mathcal{B}^{\mathbb{Z}}$, where
$T_{in}$ resp. $T_{out}$ denotes the shift on the input alphabet resp.
output algebra. By virtue of Lemma 2.2 this is equivalent to

$$K \circ T_{out} = T_{in} \circ K.$$

A cq-channel $W$ is called ergodic if it is stationary and if the
joint input-output state $\psi_{p,W}$ is ergodic for every stationary
ergodic probability measure $p$ on $(A^{\mathbb{Z}}, \Sigma_c)$.



A. Structural Properties of Ergodic CQ-Channels 

At this point we pause with further definitions and give an
alternative characterization of ergodicity of cq-channels which
parallels the characterization of ergodic states and probability
measures. To this end we need the notion of equality of two
cq-channels $W_1$ and $W_2$: Stationary cq-channels $W_1, W_2 : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$
are equal if for all $f \in B(A^{\mathbb{Z}}, \Sigma_c)$, $b \in \mathcal{B}^{\mathbb{Z}}$ and
all stationary $p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$

$$\psi_{p,W_1}(f \otimes b) = \psi_{p,W_2}(f \otimes b) \tag{6}$$

holds, i.e. if $W_1$ and $W_2$ generate the same stationary joint
input-output states. Note that this is equivalent to the assertion
that for all $b \in \mathcal{B}^{\mathbb{Z}}$ and all stationary probability measures $p$
we have $W_1(x,b) = W_2(x,b)$ almost surely with respect to
$p$. In view of Theorem 2.5 this can be rephrased as

$$p(f K_1(b)) = p(f K_2(b)), \tag{7}$$

or

$$p(E_1(f \otimes b)) = p(E_2(f \otimes b)) \tag{8}$$

for all $f \in B(A^{\mathbb{Z}}, \Sigma_c)$, $b \in \mathcal{B}^{\mathbb{Z}}$ and all stationary
$p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$.

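Theorem 2.7: Let $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary
cq-channel. Then the following assertions are equivalent:

1) $W$ is extremal in the convex set of stationary cq-channels.

2) $W$ is ergodic.

Proof: 2) $\Rightarrow$ 1). Suppose that $W$ is not extremal. Then
there are stationary cq-channels $W_1, W_2 : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$,
$W_1 \ne W_2$, and $a \in (0,1)$ with

$$W = aW_1 + (1-a)W_2. \tag{9}$$

Since $W_1 \ne W_2$ there is a stationary $p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$ and
$f \in B(A^{\mathbb{Z}}, \Sigma_c)$, $b \in \mathcal{B}^{\mathbb{Z}}$ with

$$\psi_{p,W_1}(f \otimes b) \ne \psi_{p,W_2}(f \otimes b).$$

Note that applying the ergodic decomposition of $p$ (cf. [13]
or [32] Sec 1.4) we may assume that $p$ is already stationary
ergodic. Then using eq. (9) we obtain a convex decomposition
of the joint input-output state for $W$:

$$\psi_{p,W} = a\,\psi_{p,W_1} + (1-a)\,\psi_{p,W_2},$$

with $\psi_{p,W_1}(f \otimes b) \ne \psi_{p,W_2}(f \otimes b)$. This shows that $W$
cannot be ergodic.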

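1) $\Rightarrow$ 2). Assume that $W$ is extremal and that there are
a stationary ergodic $p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$, $a \in (0,1)$ and two
stationary states $\psi_1, \psi_2$ on $B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}}$ with $\psi_1 \ne \psi_2$
and

$$\psi_{p,W} = a\,\psi_1 + (1-a)\,\psi_2. \tag{10}$$

Our goal is to construct stationary cq-channels
$W_1, W_2 : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$, $W_1 \ne W_2$, with

$$W = aW_1 + (1-a)W_2, \tag{11}$$

and

$$\psi_i(f \otimes b) = \int f(x)\, W_i(x,b)\, p(dx), \qquad i = 1, 2 \tag{12}$$

for $f \in B(A^{\mathbb{Z}}, \Sigma_c)$, $b \in \mathcal{B}^{\mathbb{Z}}$. This would contradict the
assumed extremality of $W$ and we are done. To this end, by
ergodicity of $p$ and using the convex decomposition of $\psi_{p,W}$
in eq. (10) we see that

$$\psi_i \upharpoonright B(A^{\mathbb{Z}}, \Sigma_c) = p, \qquad i = 1, 2, \tag{13}$$

holds.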

Fix $b \in \mathcal{B}^{\mathbb{Z}}$ and define a linear functional $l_{b,i} : B(A^{\mathbb{Z}}, \Sigma_c) \to \mathbb{C}$
by

$$l_{b,i}(f) := \psi_i(f \otimes b).$$

It is obvious that

$$\|l_{b,i}\| \le \|b\|$$

holds. Moreover, $b$ can be written as a complex linear combination of
four positive elements $b_k \in \mathcal{B}^{\mathbb{Z}}$, $k = 1, \ldots, 4$, e.g. consider
first $b = \frac{1}{2}(b + b^*) + i\,\frac{1}{2i}(b - b^*)$ and then apply functional
calculus to both hermitian summands to obtain positive and
negative parts of them (cf. [18] section 4.1). For each $f \ge 0$
we have then

$$\begin{aligned}
l_{b_k,i}(f) = \psi_i(f \otimes b_k) &\le \|b_k\|\, \psi_i(f \otimes 1_{\mathcal{B}^{\mathbb{Z}}}) \\
&= \|b_k\|\, p(f) \qquad \text{(by eq. (13))}.
\end{aligned} \tag{14}$$

Thus by the dominated convergence theorem, for each sequence
$f_j \searrow 0$, $f_j \in B(A^{\mathbb{Z}}, \Sigma_c)$, we have $l_{b_k,i}(f_j) \searrow 0$,
and this implies that the functional $l_{b_k,i}$ is representable by
a unique finite measure $\mu_{b_k,i}$ on $(A^{\mathbb{Z}}, \Sigma_c)$ and this finite
measure is absolutely continuous with respect to $p$ by eq.
(14). Therefore $l_{b,i}$ is representable by a complex measure $\mu_{b,i}$
which is absolutely continuous with respect to $p$ (cf. [30] for
definitions and background information). Thus we can infer
from the Radon-Nikodym Theorem (cf. [30] for a version
concerning complex measures) that there is a $\mathbb{C}$-valued Borel
measurable function $\rho_i(\cdot, b)$ with

$$l_{b,i}(f) = \psi_i(f \otimes b) = \int f(x)\, \rho_i(x,b)\, p(dx). \tag{15}$$
In the next step we show that $\rho_i$ can be changed on a set of
$p$-measure 0 such that the resulting map $W_i$ has the following
properties:

a) $W_i(x, \cdot)$ is a state on $\mathcal{B}^{\mathbb{Z}}$.

b) $W_i(T_{in}x, b) = W_i(x, T_{out}b)$ for all $x \in A^{\mathbb{Z}}$, $b \in \mathcal{B}^{\mathbb{Z}}$.
We consider a basis $\{u_{ij}\}_{i,j=1}^{d}$ of $\mathcal{B}$ consisting of matrix units
(i.e. $u_{ij}u_{kl} = \delta_{j,k}u_{il}$, $u_{ij}^* = u_{ji}$ and $\sum_{i=1}^{d} u_{ii} = 1_{\mathcal{B}}$) and,
for $n \in \mathbb{N}$, we consider the tensor product basis
of $\mathcal{B}^{[-n,n]}$ built up from the single site basis above. Let $V_n$
denote the set of all complex linear combinations of these
basis vectors in $\mathcal{B}^{[-n,n]}$ whose coefficients have rational real
and imaginary parts. Thus $V_n$ is a countable linear space over the
field $\mathbb{Q} + i\mathbb{Q}$. If $n' > n$ there is a natural inclusion $V_n \subset V_{n'}$
which is induced by the quasi-local structure of $\mathcal{B}^{\mathbb{Z}}$. Then
the linear space over $\mathbb{Q} + i\mathbb{Q}$ given by $V := \bigcup_{n \in \mathbb{N}} V_n$ is
also countable and dense in $\mathcal{B}^{\mathbb{Z}}$. The idea is to construct $W_i$
first on $V$ and then to show that the extension to $\mathcal{B}^{\mathbb{Z}}$ inherits
the properties a), b) above. This parallels the construction of
the regular conditional probability in probability theory, the
difference being only that we use an algebraic language.
An inspection of the integral formula, eq. (15), together with
some standard measure-theoretic arguments (cf. Theorem 1.40
in [30]), shows that for $s, t \in \mathbb{Q} + i\mathbb{Q}$, $b, b' \in V$ and $b'' \in V$
with $1 \ge b'' \ge 0$ the sets

$$L_{s,t,b,b'} := \{x \in A^{\mathbb{Z}} : \rho_i(x, sb + tb') \ne s\rho_i(x,b) + t\rho_i(x,b')\},$$

$$P_{b''} := \{x \in A^{\mathbb{Z}} : \rho_i(x, b'') \notin [0, +\infty)\}$$

and

$$U := \{x \in A^{\mathbb{Z}} : \rho_i(x, 1) \ne 1\}$$

have $p$-measure 0. Since $V$ is countable, the union of all these
sets has $p$-measure 0. Thus for

$$N_1 := U \cup \bigcup_{s,t,b,b'} L_{s,t,b,b'} \cup \bigcup_{b'' \in V:\, 1 \ge b'' \ge 0} P_{b''} \tag{16}$$

we have $p(N_1) = 0$.
For $b \in V$ define

$$S(b) := \{x \in A^{\mathbb{Z}} : \rho_i(T_{in}x, b) \ne \rho_i(x, T_{out}b)\}.$$

Using $\psi_i(T_{in}f \otimes T_{out}b) = \psi_i(f \otimes b)$, eq. (15), the change of
variable formula and $T_{in}$-invariance of $p$ it is easily seen that

$$\rho_i(x,b) = \rho_i(T_{in}^{-1}x, T_{out}b) \qquad p\text{-a.s.},$$

and this is equivalent to

$$\rho_i(T_{in}x, b) = \rho_i(x, T_{out}b) \qquad p\text{-a.s.}$$

since our shifts are invertible. We have $p(N_2) = 0$ for
$N_2 := \bigcup_{b \in V} S(b)$. Set

$$E(b) := \{x \in A^{\mathbb{Z}} : W(x,b) \ne a\rho_1(x,b) + (1-a)\rho_2(x,b)\}$$

for $b \in V$.

Eq. (10) and eq. (15) imply that $p(E(b)) = 0$ for all $b \in V$.
Set $N_3 := \bigcup_{b \in V} E(b)$. Then $p(N_3) = 0$ holds.
For $N_1$ given by eq. (16), $N_2$, and $N_3$ set $N_4 := N_1 \cup N_2 \cup N_3$
and define $N := \bigcup_{k \in \mathbb{Z}} T_{in}^{k} N_4$. Then $N_4 \subseteq N$ and $T_{in}N = N$;
moreover we have $p(N) = 0$ since $p$ is $T_{in}$-invariant. We are
now in position to define $W_i$ on $V$:

$$W_i(x,b) := \begin{cases} \rho_i(x,b) & \text{if } x \in N^c \\ W(x,b) & \text{else} \end{cases} \tag{17}$$

for $i = 1, 2$ and $b \in V$. $W_i(x, \cdot)$ is by construction a
positive linear normalized functional on $V$, and $W_i(T_{in}x, b) = W_i(x, T_{out}b)$
for all $b \in V$ and all $x \in A^{\mathbb{Z}}$. Note that
the completion of each $V_n$ is $\mathcal{B}^{[-n,n]}$ ($V_n$ is even a normed
*-subalgebra of $\mathcal{B}^{[-n,n]}$ since we have used a basis consisting
of matrix units in the construction of $V_n$). It is a fairly standard fact
that we can extend each $W_i(x, \cdot)$ to $\mathcal{B}^{[-n,n]}$ while preserving
its norm, i.e. $\|W_i(x, \cdot)\| = 1$ on each $\mathcal{B}^{[-n,n]}$. This in turn
gives us a linear bounded extension from $\bigcup_{n \in \mathbb{N}} \mathcal{B}^{[-n,n]}$
to $\mathcal{B}^{\mathbb{Z}}$ with $\|W_i(x, \cdot)\| = 1$. But then we can apply theorem
4.3.2 from [18], which states that each bounded linear functional
$l$ defined on a self-adjoint subspace containing 1 of
a C*-algebra with $\|l\| = l(1)$ is positive. Thus we have
constructed two stationary cq-channels $W_1, W_2 : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$
with properties (11), (12) and $W_1 \ne W_2$ since by our
hypothesis we have $\psi_1 \ne \psi_2$. This concludes our proof. ■
Let $\mathcal{E}$ denote the convex set of all completely positive unital
maps $E : B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ which satisfy
eq. (5) and $E \circ (T_{in} \otimes T_{out}) = T_{in} \circ E$, while $\mathcal{K}$ stands
for the convex set of all completely positive unital maps
$K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ with $K \circ T_{out} = T_{in} \circ K$. Then note
that the extremality of $W$ carries over to the associated
completely positive maps $K$ and $E$ via Theorem 2.5. Thus we
have the following

Corollary 2.8: Let $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary
cq-channel, and consider the associated completely positive unital
maps $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ and
$E : B(A^{\mathbb{Z}}, \Sigma_c) \otimes \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$. Then the following statements are equivalent:

1) $W$ is ergodic.

2) $W$ is extremal in the convex set of stationary cq-channels.

3) $K$ is extremal in $\mathcal{K}$.

4) $E$ is extremal in $\mathcal{E}$.

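In light of our discussion preceding Theorem 2.6 we see that,
if we replace the input algebra $B(A^{\mathbb{Z}}, \Sigma_c)$ by an arbitrary
quasi-local algebra $\mathcal{A}^{\mathbb{Z}}$, the ergodicity of a quantum channel
$K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ should be defined in the following way: $K$ is
stationary, i.e. $K \circ T_{out} = T_{in} \circ K$, and $K$ is an extreme point in
the convex set of stationary quantum channels. The equality of
channels is defined in analogy to eq. (7), i.e. channels
$K_1, K_2 : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ are equal if for all $a \in \mathcal{A}^{\mathbb{Z}}$, all $b \in \mathcal{B}^{\mathbb{Z}}$
and each stationary state $\varphi$ on $\mathcal{A}^{\mathbb{Z}}$

$$\varphi(aK_1(b)) = \varphi(aK_2(b)) \tag{18}$$

holds.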

Before closing this subsection we give an example that emphasizes
the important role played by the equality definition (7)
for the cq-channel $K$, or equivalently for the maps $W$ and $E$.
Example. Let us consider the commutative C*-algebra
$B(A^{\mathbb{Z}}, \Sigma_c)$ with $A = \{0, 1\}$. Let $a \in A^{\mathbb{Z}}$ be the periodic
sequence given by $a := (\ldots, 0, 1, 0, 1, \ldots)$ and consider
its shifted version $Ta$, where $T : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$ denotes the
usual (left) shift on doubly infinite sequences. Denoting by
$\delta_a$ the point measure concentrated on the sequence $a$ we
define $q := \frac{1}{2}(\delta_a + \delta_{Ta})$. It is easily seen that $q$ is stationary
ergodic with respect to $T$. Let us consider the channel
$K : B(A^{\mathbb{Z}}, \Sigma_c) \to B(A^{\mathbb{Z}}, \Sigma_c)$ with $K(f) := q(f)1$, where 1
denotes the identity in $B(A^{\mathbb{Z}}, \Sigma_c)$. It is then obvious that for
all $f, g \in B(A^{\mathbb{Z}}, \Sigma_c)$ and all probability distributions $p$ on $A^{\mathbb{Z}}$
we have $p(gK(f)) = p(g)q(f)$, i.e. the joint input-output state
is a product state. If we choose $p = q$ then it is easily verified
that the set $I_1 := \{(a, Ta), (Ta, a)\}$ fulfills $(T \otimes T)^{-1}I_1 = I_1$
and $(q \otimes q)(I_1) = \frac{1}{2}$, thus $q \otimes q$ is not ergodic. Hence the
channel $K$ is not ergodic. However, it is clear that $K$ maps
stationary ergodic measures to stationary ergodic measures,
namely to $q$ which is stationary ergodic.
If we define the equality of channels by the requirement that
for all $f \in B(A^{\mathbb{Z}}, \Sigma_c)$ and all stationary probability measures
$p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$

$$p(K_1(f)) = p(K_2(f))$$

holds, then it is obvious that the channel $K$ from above is an
extreme point in the set of stationary channels with respect to
this notion of equality. On the other hand, if we use the notion
of equality from eq. (18) then it is readily checked that for

$$W_1(x,g) := \begin{cases} 1_{\{a\}}(x)\,g(Ta) + 1_{\{Ta\}}(x)\,g(a) & \text{if } x \in \{a, Ta\} \\ q(g) & \text{else} \end{cases}$$

and

$$W_2(x,g) := \begin{cases} 1_{\{a\}}(x)\,g(a) + 1_{\{Ta\}}(x)\,g(Ta) & \text{if } x \in \{a, Ta\} \\ q(g) & \text{else} \end{cases}$$

we have a convex decomposition of the channel $K$ into
stationary channels with the weights $(\frac{1}{2}, \frac{1}{2})$ in the sense of the
definition in (18) with $W_1 \ne W_2$.
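The claims of this example only involve the two-point orbit $\{a, Ta\}$, so they can be verified mechanically. The Python sketch below is an illustration under that reduction: sequences are represented by the labels `"a"` and `"Ta"`, functions on the orbit by dictionaries, and the output shift is assumed to act on functions by $(Tg)(x) = g(Tx)$. Off the orbit both $W_1$ and $W_2$ equal $q(g)$ by definition, so nothing needs checking there.

```python
from fractions import Fraction as F

# The measure q lives on the two-point orbit {a, Ta}; the shift T swaps them.
A_SEQ, TA_SEQ = "a", "Ta"
support = [A_SEQ, TA_SEQ]
T = {A_SEQ: TA_SEQ, TA_SEQ: A_SEQ}
q = {A_SEQ: F(1, 2), TA_SEQ: F(1, 2)}

# q (x) q is not ergodic: I_1 is invariant under T x T and has mass 1/2.
I1 = {(A_SEQ, TA_SEQ), (TA_SEQ, A_SEQ)}
I1_shifted = {(T[x], T[y]) for (x, y) in I1}
mass_I1 = sum(q[x] * q[y] for (x, y) in I1)


def q_of(g):
    """Expectation q(g) of a function g on the orbit."""
    return sum(q[x] * g[x] for x in support)


def W1(x, g):
    """W_1 on the orbit: evaluate g at the other point of the orbit."""
    return g[T[x]]


def W2(x, g):
    """W_2 on the orbit: evaluate g at x itself."""
    return g[x]


def shift_out(g):
    """Assumed action of the output shift on functions: (T g)(x) = g(T x)."""
    return {x: g[T[x]] for x in support}


g = {A_SEQ: F(3), TA_SEQ: F(-5)}   # arbitrary test function on the orbit
mixture_ok = all((W1(x, g) + W2(x, g)) / 2 == q_of(g) for x in support)
stationary_ok = all(Wi(T[x], g) == Wi(x, shift_out(g))
                    for Wi in (W1, W2) for x in support)
distinct = any(W1(x, g) != W2(x, g) for x in support)
print(I1_shifted == I1, mass_I1, mixture_ok, stationary_ok, distinct)
```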

B. Continuity Properties of CQ-Channels 

The idea that continuity properties of channels play a 
crucial role in establishing coding theorems is well known 
in information theory and goes back to the classic paper 
by McMillan [26]. Subsequent development of this idea in 
classical information theory showed that the most fruitful
notion of continuity is that with respect to the $\bar{d}$-distance, as
was demonstrated by Gray and Ornstein [14]. At the present
time we do not have a notion of distance for quantum states
having similarly nice properties as the $\bar{d}$-distance in the
classical setting. E.g. the $\bar{d}$-distance is extremely sensitive to
ergodic/mixing properties of probability measures and is much
weaker than the variational distance. Moreover the entropy rates
are $\bar{d}$-continuous. Nice introductions to this notion of distance
and its application in ergodic/information theory can be found 
in the monographs [32] by Shields and [15] by Gray. In this 
paper we will restrict ourselves to the variational distance, 
which can be extended to quantum states without causing any 
problems (cf. [14] for corresponding classical definitions of 
continuity of channels with respect to the variational distance 
and disadvantages of them). 

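A cq-channel $W$ is called causal if for each $n \in \mathbb{Z}$,
$b \in \mathcal{B}^{(-\infty,n]}$ and all $x, \tilde{x} \in A^{\mathbb{Z}}$ with $x_i = \tilde{x}_i$ for $i \le n$

$$W(x,b) = W(\tilde{x},b)$$

holds. $W$ is called input memoryless if for each $n \in \mathbb{Z}$,
$b \in \mathcal{B}^{[n,\infty)}$ and all $x, \tilde{x} \in A^{\mathbb{Z}}$ with $x_i = \tilde{x}_i$ for $i \ge n$ the channel
fulfills

$$W(x,b) = W(\tilde{x},b).$$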

Remark 2.9: We see at this point immediately that the
dependence of the channel on past and future inputs is
intimately connected with the continuity properties of the
map $W$. E.g. if the channel $W$ is input memoryless and
causal (IMC in what follows) then for each $b \in \mathcal{B}^{[n,n+k]}$
and each $x \in A^{\mathbb{Z}}$ the function $W(x,b)$ depends merely on
the coordinates $x_n, x_{n+1}, \ldots, x_{n+k}$, and thus can be identified
with a continuous function on $A^{\mathbb{Z}}$. In view of Lemma 2.2
and the continuity of the linear map $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$
(which is ensured by its positivity) it is immediately clear
that $K(\mathcal{B}^{\mathbb{Z}}) \subset C(A^{\mathbb{Z}})$, the continuous $\mathbb{C}$-valued functions
equipped with the $\|\cdot\|_\infty$-norm. Consequently, in this case the
map $A^{\mathbb{Z}} \ni x \mapsto W(x,b)$ is continuous for any fixed $b \in \mathcal{B}^{\mathbb{Z}}$.
Although the requirement that the channel $W$ acts causally
seems to be natural, the restriction to input memoryless
channels would be a serious limitation in many situations.
However, we expect that input letters far in
the past should not much affect present and future outputs.
This motivates our next definition.

A channel $W$ is said to have decaying input memory (DIM
channel for short) if for each $\varepsilon > 0$ there is an integer $m(\varepsilon)$
such that for all $n \in \mathbb{Z}$ and all $b \in \mathcal{B}^{[n,\infty)}$ with $0 \le b \le 1$

$$|W(x,b) - W(x',b)| < \varepsilon \tag{19}$$

whenever $x_i = x'_i$ for $i \ge n - m$, for all $m \ge m(\varepsilon)$. This can
be compactly restated as

$$\lim_{m \to \infty}\ \sup_{x,y:\ x_i = y_i,\ i \ge n-m} d_{v,n}(W_x, W_y) = 0 \tag{20}$$

for all $n \in \mathbb{Z}$, where $W_x := W(x, \cdot)$ and

$$d_{v,n}(W_x, W_y) := \sup_{b \in \mathcal{B}^{[n,\infty)},\ 0 \le b \le 1} |W(x,b) - W(y,b)| \tag{21}$$

is the usual variational distance for quantum states. Note that
the set over which the maximization in eq. (21) is performed can be
replaced by the set of orthogonal projections in $\mathcal{B}^{[n,\infty)}$ since
the functional which is maximized is convex and projections
are extreme points of the set $\{b \in \mathcal{B}^{[n,\infty)} : 0 \le b \le 1\}$ (see
[9] Lemma 2.3).
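The reduction to projections can be illustrated in the simplest commutative special case. The Python sketch below assumes that both states and all test elements $b$ are diagonal in a common basis of a three-level system, so that states are probability vectors and projections are 0/1 vectors; this is a restriction made for illustration only, and the numbers are made up. In that case the supremum over projections coincides with half the $\ell_1$-distance of the two probability vectors.

```python
from fractions import Fraction as F
from itertools import product


def expectation(state, b):
    """tr(state * b) for a diagonal state and a diagonal observable b."""
    return sum(s * bj for s, bj in zip(state, b))


def sup_over_projections(rho, sigma):
    """Maximize |rho(b) - sigma(b)| over diagonal projections b (entries 0 or 1)."""
    return max(abs(expectation(rho, b) - expectation(sigma, b))
               for b in product((0, 1), repeat=len(rho)))


def half_l1(rho, sigma):
    """Half of the l1 distance between two probability vectors."""
    return sum(abs(r - s) for r, s in zip(rho, sigma)) / 2


rho = [F(1, 2), F(1, 3), F(1, 6)]
sigma = [F(1, 4), F(1, 4), F(1, 2)]

# An element with 0 <= b <= 1 that is not a projection cannot exceed the
# maximum attained on projections.
b_mixed = [F(1), F(1, 2), F(0)]
value_mixed = abs(expectation(rho, b_mixed) - expectation(sigma, b_mixed))
print(sup_over_projections(rho, sigma), half_l1(rho, sigma), value_mixed)
```

The brute-force search over all $2^3$ diagonal projections is only feasible for such tiny examples.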

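A channel $W$ has decaying input memory and anticipation
(DIMA) if for each $\varepsilon > 0$ there are non-negative integers
$m(\varepsilon)$, $a(\varepsilon)$ such that for all $n \in \mathbb{Z}$, $k \in \mathbb{N}$ and all $b \in \mathcal{B}^{[n,n+k]}$
with $0 \le b \le 1$ eq. (19) holds whenever $x_i = x'_i$ for
$n - m \le i \le n + k + a$ and all $m \ge m(\varepsilon)$, $a \ge a(\varepsilon)$. Again, this can
be equivalently described by

$$\lim_{m,a \to \infty}\ \sup_{x,y:\ x_i = y_i,\ n-m \le i \le n+k+a} d_{v,n,k}(W_x, W_y) = 0 \tag{22}$$

for all $n \in \mathbb{Z}$, $k \in \mathbb{N}$, where $d_{v,n,k}(\cdot, \cdot)$ is defined analogously
to $d_{v,n}$ in eq. (21), the difference being only that we replace
$\mathcal{B}^{[n,\infty)}$ by $\mathcal{B}^{[n,n+k]}$.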
Our next lemma shows that the associated map $K$ from
Lemma 2.2 of each causal DIM or DIMA channel $W$ maps
into the set $C(A^{\mathbb{Z}})$.

Lemma 2.10: Let $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a cq-channel and
let $K : \mathcal{B}^{\mathbb{Z}} \to B(A^{\mathbb{Z}}, \Sigma_c)$ be the associated map from Lemma
2.2.

a) The following assertions are equivalent:

1) $K(\mathcal{B}^{\mathbb{Z}}) \subset C(A^{\mathbb{Z}})$.

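2) For each $b \in \mathcal{B}^{\mathbb{Z}}$

$$\lim_{m\to\infty}\ \sup_{x,y:\ x_i = y_i,\ |i| \le m} |W(x,b) - W(y,b)| = 0.$$

b) If $W$ is DIMA then $K(\mathcal{B}^{\mathbb{Z}}) \subset C(A^{\mathbb{Z}})$.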

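Proof: a) This is a simple consequence of the fact that cylinder
sets are open.

b) That $K(b) \in C(A^{\mathbb{Z}})$ holds for $b \in \mathcal{B}^{[n,n+k]}$ is obvious since
every such $b$ can be written as $b = b_1 + i b_2$ with hermitian
$b_1, b_2 \in \mathcal{B}^{[n,n+k]}$ and each hermitian element of $\mathcal{B}^{[n,n+k]}$ can
be written as a linear combination of orthogonal projections
in $\mathcal{B}^{[n,n+k]}$ (spectral theorem), so that our DIMA condition,
eq. (22), applies. For a general $b \in \mathcal{B}^{\mathbb{Z}}$, by definition of the
quasi-local algebra $\mathcal{B}^{\mathbb{Z}}$, there are sequences $n_i \in \mathbb{Z}$, $k_i \in \mathbb{N}$
and $b_i \in \mathcal{B}^{[n_i,n_i+k_i]}$ with $\lim_{i\to\infty} \|b_i - b\| = 0$. By continuity
of $K$ this implies that $\lim_{i\to\infty} \|K(b_i) - K(b)\|_\infty = 0$ and
thus that $K(b) \in C(A^{\mathbb{Z}})$. $\blacksquare$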
General Hypotheses 

1) Since we will be concerned only with DIMA cq-channels
we will henceforth consider only $C(A^{\mathbb{Z}}) \subset
B(A^{\mathbb{Z}}, \Sigma_c)$ and will consider the joint input-output state
over $C(A^{\mathbb{Z}}) \otimes \mathcal{B}^{\mathbb{Z}}$. Ergodicity will always be defined
with respect to this algebra.

2) All channels are assumed to be stationary.

3) It is easily inferred from the results in [28], chapter 12,
and [19], sections 11.3 and 11.4 that $C(A^{\mathbb{Z}}) \otimes \mathcal{B}^{\mathbb{Z}}$ is
$*$-isomorphic to the quasi-local algebra $(\mathcal{F}(A) \otimes \mathcal{B})^{\mathbb{Z}}$.
This is basically due to the facts that $C(A^{\mathbb{Z}}) \otimes \mathcal{B}^{\mathbb{Z}}$
can be seen as the inductive limit of the $C^*$-algebras
$(C(A^{[-n,n]}) \otimes \mathcal{B}^{[-n,n]})_{n\in\mathbb{N}}$ and each of these algebras is
$*$-isomorphic to $(\mathcal{F}(A) \otimes \mathcal{B})^{[-n,n]}$. We shall therefore
consider $C(A^{\mathbb{Z}}) \otimes \mathcal{B}^{\mathbb{Z}}$ w.l.o.g. as a quasi-local algebra
in what follows and the results from [4] apply to this
situation.

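Consider an input memoryless causal (IMC) cq-channel $K :
\mathcal{B}^{\mathbb{Z}} \to C(A^{\mathbb{Z}})$. We see immediately that in this situation the
cq-channel can be described by a family of maps $W^n : A^n \times
\mathcal{B}^n \to \mathbb{C}$, $n \in \mathbb{N}$, such that $W^n(x^n, \cdot)$ is a state on
$\mathcal{B}^n$ for each $x^n := (x_1, \ldots, x_n) \in A^n$. Now, suppose that $W$ is
ergodic. If the input measure $p$ is ergodic and if we denote
its marginal distributions on $A^n$ by $p^n$ then we obtain for the
marginal input-output state

$$\psi^n_{p,W}(f \otimes b) = \sum_{x^n \in A^n} p^n(x^n) f(x^n) W^n(x^n, b) \qquad (f \in \mathcal{F}(A^n),\ b \in \mathcal{B}^n),$$

and these marginal input-output states $(\psi^n_{p,W})_{n\in\mathbb{N}}$ define an
ergodic state on $C(A^{\mathbb{Z}}) \otimes \mathcal{B}^{\mathbb{Z}}$ which we also denote by $\psi_{p,W}$
for notational simplicity. It is easily seen that the state $\psi^n_{p,W}$
corresponds to the following density operator

$$D^n_{p,W} = \sum_{x^n \in A^n} p^n(x^n) |x^n\rangle\langle x^n| \otimes D_{x^n}, \tag{23}$$

where $D_{x^n}$ denotes the density operator of the state
$W^n(x^n, \cdot)$, $x^n = (x_1, \ldots, x_n) \in A^n$, and $|x^n\rangle = e_{x_1} \otimes
\cdots \otimes e_{x_n}$ for some fixed orthonormal basis $\{e_j\}_{j=1}^{|A|}$ of
$\mathbb{C}^{|A|}$.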
A code $C = (u_i, b_i)_{i=1}^M$ consists of sequences $u_1, \ldots, u_M \in
A^n$ (code words) and a family of positive semi-definite operators
$b_1, \ldots, b_M \in \mathcal{B}^n$ (decoding operators) with $\sum_{i=1}^M b_i \le 1$.
The error probability of a code $C = (u_i, b_i)_{i=1}^M$ is given by

$$e(C) := \max_{i=1,\ldots,M}\ \sup_{x \in [u_i]} (1 - W(x, b_i)),$$

where $[u_i]$ denotes the cylinder set generated by the sequence
$u_i$. The average error probability is given by

$$\bar{e}(C) := \frac{1}{M} \sum_{i=1}^{M} \sup_{x \in [u_i]} (1 - W(x, b_i)). \tag{24}$$

A real number $R$ is said to be an achievable rate for the
cq-channel $W$ if there is a sequence of codes $(C_n)_{n\in\mathbb{N}}$ with

$$\liminf_{n\to\infty} \frac{1}{n} \log M_n \ge R,$$

and

$$\lim_{n\to\infty} e(C_n) = 0.$$

The (weak) capacity $C(W)$ of the cq-channel $W$ is defined
as the least upper bound of achievable rates.
The expression for the error probability of an IMC
cq-channel simplifies to

$$e(C) = \max_{i=1,\ldots,M} (1 - W^n(u_i, b_i)).$$
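To make the definitions above concrete, the following Python sketch evaluates the maximal and the average error probability of a toy code with block length $n = 1$ for an IMC cq-channel with qubit outputs. The two output states, the decoding operators and all function names are our own illustrative assumptions; they are not taken from any channel considered in the text.

```python
# Toy illustration (assumed data): error probabilities of a code for an
# IMC cq-channel with n = 1 and real 2x2 output density matrices.

def trace_product(a, b):
    """Return tr(a b) for two 2x2 matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def success_probabilities(states, decoders):
    """W(u_i, b_i) = tr(D_{u_i} b_i) for each code word index i."""
    return [trace_product(d, b) for d, b in zip(states, decoders)]

def max_error(states, decoders):
    """Maximal error probability e(C) in the IMC case."""
    return max(1.0 - w for w in success_probabilities(states, decoders))

def avg_error(states, decoders):
    """Average error probability of the same code."""
    probs = success_probabilities(states, decoders)
    return sum(1.0 - w for w in probs) / len(probs)

# Hypothetical choice: code word 1 is mapped to |0><0|, code word 2 to |+><+|.
D1 = [[1.0, 0.0], [0.0, 0.0]]
D2 = [[0.5, 0.5], [0.5, 0.5]]
# Decoding operators: projections onto |0> and |1>; they sum to the identity.
b1 = [[1.0, 0.0], [0.0, 0.0]]
b2 = [[0.0, 0.0], [0.0, 1.0]]

e_max = max_error([D1, D2], [b1, b2])
e_avg = avg_error([D1, D2], [b1, b2])
```

Since the two assumed output states are not orthogonal, the second code word is decoded correctly only with probability 1/2, which is why the maximal error exceeds the average error in this example.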

C. Examples 

Example 1. Our first example is the discrete memoryless
cq-channel as considered in [17], [31], [38]. Clearly, such channels
are input memoryless, causal and stationary ergodic.
Example 2. Another interesting class of channels are those
with Markovian correlated noise from [23], [7], [8]. Let
$(E_y)_{y \in I}$ be a finite family of completely positive unital maps
$E_y : \mathcal{B} \to \mathcal{A}$ where $\mathcal{B}$, $\mathcal{A}$ denote algebras of linear operators
over suitable finite-dimensional Hilbert spaces. Moreover, let
us consider a stationary irreducible aperiodic Markovian probability
measure $\mu \in \mathcal{P}(I^{\mathbb{Z}}, \Sigma_c)$ with stationary distribution $q$
and transition matrix $Q = q(\cdot|\cdot)$. For each $n \in \mathbb{Z}$ and $k \in \mathbb{N}$ we
define a completely positive unital map $E_{n,k} : \mathcal{B}^{[n,n+k-1]} \to
\mathcal{A}^{[n,n+k-1]}$ by

$$E_{n,k} := \sum_{y^k \in I^k} \mu^k(y^k)\, E_{y_1} \otimes \cdots \otimes E_{y_k}.$$

This family determines a unique completely positive unital
map $E : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ whose restriction to $\mathcal{B}^{[n,n+k-1]}$ coincides
with $E_{n,k}$. It is clear by definition of $E$ that $E \circ T_{\mathcal{B}} = T_{\mathcal{A}} \circ
E$ holds, where $T_{\mathcal{B}}$ resp. $T_{\mathcal{A}}$ denote the shifts on $\mathcal{B}^{\mathbb{Z}}$ resp.
$\mathcal{A}^{\mathbb{Z}}$, i.e. the quantum channel $E$ is stationary. Furthermore,
the condition $E|_{\mathcal{B}^{[n,n+k-1]}} = E_{n,k}$ means that the channel
is input memoryless and causal. Consider the dual map $E' :
\mathcal{S}(\mathcal{A}^{\mathbb{Z}}) \to \mathcal{S}(\mathcal{B}^{\mathbb{Z}})$ which is defined by

$$(E'(\varphi))(b) := \varphi(E(b)) \qquad (b \in \mathcal{B}^{\mathbb{Z}},\ \varphi \in \mathcal{S}(\mathcal{A}^{\mathbb{Z}})),$$



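and local dual maps $E'_{n,k} : \mathcal{S}(\mathcal{A}^{[n,n+k-1]}) \to \mathcal{S}(\mathcal{B}^{[n,n+k-1]})$
which are given by

$$E'_{n,k} := \sum_{y^k \in I^k} \mu^k(y^k)\, E'_{y_1} \otimes \cdots \otimes E'_{y_k},$$

where $E'_y$ is defined by

$$\mathrm{tr}(E'_y(D)\, b) := \mathrm{tr}(D\, E_y(b)),$$

for all $b \in \mathcal{B}$ and all density operators $D$ in $\mathcal{A}$.

For a given finite family $(D_a)_{a \in A}$ of density operators in $\mathcal{A}$ let
us consider the states $\varphi_x$ on $\mathcal{A}^{\mathbb{Z}}$, $x \in A^{\mathbb{Z}}$, whose restrictions
to $\mathcal{A}^{[n,n+k-1]}$ have density operators $D_{x_n} \otimes \cdots \otimes D_{x_{n+k-1}}$.
Set

$$W(x, b) := (E'(\varphi_x))(b) \qquad (x \in A^{\mathbb{Z}},\ b \in \mathcal{B}^{\mathbb{Z}}).$$

It is clear that for $b \in \mathcal{B}^{[n,n+k-1]}$

$$W(x, b) = \mathrm{tr}\big((D_{x_n} \otimes \cdots \otimes D_{x_{n+k-1}})\, E_{n,k}(b)\big)$$

holds, and thus the cq-channel $W$ is input memoryless, causal
and stationary.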

Due to our assumption that the Markovian measure is irreducible
and aperiodic we know that $\lim_{n\to\infty} (Q^n)_{y',y} = q(y)$
exponentially fast for all $y, y' \in I$. Using this fact it is easily
shown that for all $b_1, b_2 \in \mathcal{B}^{[n,n+k]}$ and all $x \in A^{\mathbb{Z}}$

$$\lim_{l\to\infty} |W(x, b_1 T_{\mathrm{out}}^l b_2) - W(x, b_1) W(x, T_{\mathrm{out}}^l b_2)| = 0,$$

exponentially fast. By approximating $b_1, b_2 \in \mathcal{B}^{\mathbb{Z}}$ by local
observables it is immediately clear that

$$\lim_{l\to\infty} |W(x, b_1 T_{\mathrm{out}}^l b_2) - W(x, b_1) W(x, T_{\mathrm{out}}^l b_2)| = 0,$$

i.e. the channel $W$ is output mixing in classical terminology.
Due to this fact and mimicking the classical calculation (see
[1], [15] Lemma 9.4.3) using the definition of the joint state, eq.
(2), it is easily shown that $W$ is ergodic.
Note that a similar construction starting with an arbitrary
stationary mixing probability measure $\mu \in \mathcal{P}(I^{\mathbb{Z}}, \Sigma_c)$ leads
to a stationary ergodic IMC cq-channel. Hence our Theorems
4.4 and 4.5 apply to this whole class of cq-channels.
Example 3. We can easily modify the last example to obtain
channels with finite input memory: we just have to use an
irreducible aperiodic stationary Markovian measure of order
$k$ in our construction.

Further examples can be found in [20] and the references
therein.

III. Results on Typical Projections 

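In this section we give some auxiliary results that are
repeatedly used in the rest of the paper.
We start with a non-commutative version of the fact that the
intersection of two sets that have high probability must also
be highly likely. We use the Hilbert-Schmidt inner product
on linear operators, which is given by $\langle a, b\rangle_{HS} := \mathrm{tr}(a^* b)$ for
$a, b \in \mathcal{L}(\mathcal{H})$. Moreover, we need the notion of partial trace
for density operators acting on $\mathcal{H}_1 \otimes \mathcal{H}_2$, which is defined as
follows: Let $D \in \mathcal{L}(\mathcal{H}_1 \otimes \mathcal{H}_2) \simeq \mathcal{L}(\mathcal{H}_1) \otimes \mathcal{L}(\mathcal{H}_2)$ be a density
operator. The partial traces of $D$ are the uniquely determined
density operators $D_i$, $i = 1, 2$, in $\mathcal{L}(\mathcal{H}_i)$ which fulfill

$$\mathrm{tr}_{\mathcal{H}_1}(D_1 a_1) = \mathrm{tr}_{\mathcal{H}_1 \otimes \mathcal{H}_2}(D (a_1 \otimes 1_{\mathcal{H}_2}))$$

and

$$\mathrm{tr}_{\mathcal{H}_2}(D_2 a_2) = \mathrm{tr}_{\mathcal{H}_1 \otimes \mathcal{H}_2}(D (1_{\mathcal{H}_1} \otimes a_2))$$

for all $a_1 \in \mathcal{L}(\mathcal{H}_1)$ resp. $a_2 \in \mathcal{L}(\mathcal{H}_2)$. The usual notation for $D_1$
is $\mathrm{tr}_2(D)$, and similarly for $D_2$.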

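Lemma 3.1: 1) Let $D \in \mathcal{L}(\mathcal{H})$ be an operator with $0 \le
D \le 1$ and $\mathrm{tr}(D) \le 1$ and let $q_1, q_2 \in \mathcal{L}(\mathcal{H})$ be projections
with $\mathrm{tr}(D q_i) \ge 1 - \varepsilon_i$, $i = 1, 2$. Then

$$\mathrm{tr}(D q_2 q_1 q_2) \ge 1 - \varepsilon_1 - 2\sqrt{\varepsilon_2}.$$

2) Let $D \in \mathcal{L}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ be a density operator and let $D_1 :=
\mathrm{tr}_2(D)$ resp. $D_2 := \mathrm{tr}_1(D)$ be the reduced density operators.
Then for any projections $q_i \in \mathcal{L}(\mathcal{H}_i)$, $i = 1, 2$, with $\mathrm{tr}(D_i q_i) =
1 - \varepsilon_i$ we have

$$\mathrm{tr}(D (q_1 \otimes q_2)) \ge 1 - \varepsilon_1 - \sqrt{\varepsilon_2}.$$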
Proof: 1) The proof consists of an elementary application
of the Cauchy-Schwarz inequality for the Hilbert-Schmidt
inner product. Indeed, note that $1 = q_2 + (1 - q_2)$; then we
have

$$\begin{aligned}
1 - \varepsilon_1 \le \mathrm{tr}(D q_1) &= \mathrm{tr}(D 1 q_1) \\
&= \mathrm{tr}(D (q_2 + (1 - q_2)) q_1) \\
&\le |\mathrm{tr}(D q_2 q_1)| + |\mathrm{tr}(D (1 - q_2) q_1)|.
\end{aligned} \tag{25}$$

Note that

$$|\mathrm{tr}(D (1 - q_2) q_1)| = |\mathrm{tr}(D^{1/2} (1 - q_2) q_1 D^{1/2})| \le \sqrt{\varepsilon_2},$$

where we have applied the Cauchy-Schwarz inequality to
$a^* := D^{1/2} (1 - q_2)$ and $b := q_1 D^{1/2}$ and we have used that
$\mathrm{tr}(D (1 - q_2)) \le \varepsilon_2$ and $\mathrm{tr}(D q_1) \le 1$ hold. Thus the inequality
(25) above can be rewritten as

$$1 - \varepsilon_1 - \sqrt{\varepsilon_2} \le |\mathrm{tr}(D q_2 q_1)|.$$

Using $1 = q_2 + (1 - q_2)$ again we obtain from this inequality

$$\begin{aligned}
1 - \varepsilon_1 - \sqrt{\varepsilon_2} &\le |\mathrm{tr}(D q_2 q_1)| \\
&\le |\mathrm{tr}(D q_2 q_1 q_2)| + |\mathrm{tr}(D q_2 q_1 (1 - q_2))| \\
&= |\mathrm{tr}(D q_2 q_1 q_2)| + |\mathrm{tr}(D^{1/2} q_2 q_1 (1 - q_2) D^{1/2})|.
\end{aligned}$$

The last term can be upper bounded by $\sqrt{\varepsilon_2}$, as is easily seen
from $q_2 q_1 q_2 \le 1$ and the Cauchy-Schwarz inequality applied
to $a^* := D^{1/2} q_2 q_1$ and $b := (1 - q_2) D^{1/2}$. We are done now
because

$$|\mathrm{tr}(D q_2 q_1 q_2)| = \mathrm{tr}(D q_2 q_1 q_2),$$

which in turn follows from $q_2 q_1 q_2 \ge 0$.

2) Note that by our assumption and by the definition of the
partial trace we have

$$\begin{aligned}
1 - \varepsilon_1 = \mathrm{tr}(D_1 q_1) &= \mathrm{tr}(D (q_1 \otimes 1)) \\
&= \mathrm{tr}(D (q_1 \otimes q_2)) + \mathrm{tr}(D (q_1 \otimes (1 - q_2))) \\
&= \mathrm{tr}(D (q_1 \otimes q_2)) + \mathrm{tr}(D (q_1 \otimes 1)(1 \otimes (1 - q_2))).
\end{aligned}$$

The desired inequality is then a consequence of $\mathrm{tr}(D_2 q_2) =
1 - \varepsilon_2$, the definition of the partial trace and the Cauchy-Schwarz
inequality for the Hilbert-Schmidt inner product. $\blacksquare$
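A quick numerical sanity check of part 1 of Lemma 3.1 can be sketched as follows. The density operator `D`, the one-parameter family of rank-one projections and all helper names are hypothetical choices made for illustration only; the script merely confirms the inequality on these instances and is no substitute for the proof.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def projector(theta):
    """Rank-one projection onto cos(theta)|0> + sin(theta)|1>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def lemma_gap(D, q1, q2):
    """tr(D q2 q1 q2) - (1 - eps1 - 2 sqrt(eps2)), where eps_i := 1 - tr(D q_i).

    Lemma 3.1, part 1, asserts that this quantity is nonnegative.
    """
    eps1 = 1.0 - trace(matmul(D, q1))
    eps2 = 1.0 - trace(matmul(D, q2))
    lhs = trace(matmul(D, matmul(q2, matmul(q1, q2))))
    return lhs - (1.0 - eps1 - 2.0 * math.sqrt(max(eps2, 0.0)))

D = [[0.95, 0.0], [0.0, 0.05]]    # assumed density operator
q1 = projector(0.0)               # projection onto |0>
# Rotate q2 away from q1 in steps of 0.05 rad, up to roughly pi/2.
gaps = [lemma_gap(D, q1, projector(0.05 * k)) for k in range(32)]
```

The gap is largest when both projections nearly coincide with the support of `D` and the bound becomes trivial (negative right hand side) once `q2` captures little of the weight of `D`.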
Our next lemma states that the restriction of the local states to
a sequence of high-probability subspaces does not affect the
v. Neumann entropy rate. The notions of states, stationarity
and v. Neumann entropy rate are introduced in the appendix,
especially in section VII-B.

Lemma 3.2: Let $\psi \in \mathcal{S}(\mathcal{A}^{\mathbb{Z}})$ be a stationary state with
v. Neumann entropy rate $s$, where $\mathcal{A}^{\mathbb{Z}}$ is a quasi-local algebra
constructed from a finite dimensional $C^*$-algebra $\mathcal{A}$.
Let $(q_n)_{n\in\mathbb{N}}$, $q_n \in \mathcal{A}^n$, be a sequence of projections with
$\lim_{n\to\infty} \psi^n(q_n) = 1$. Then we have

$$\lim_{n\to\infty} \frac{1}{n} S(q_n D_{\psi^n} q_n) = s.$$

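Proof: Define $D_n := q_n D_{\psi^n} q_n + q_n^\perp D_{\psi^n} q_n^\perp$, where $q_n^\perp$
denotes the projection onto the orthogonal complement of the
range of $q_n$. Then we know that (cf. [22])

$$S(D_{\psi^n}) \le S(D_n) = S(q_n D_{\psi^n} q_n) + S(q_n^\perp D_{\psi^n} q_n^\perp), \tag{26}$$

holds. Now, using

$$S(q_n^\perp D_{\psi^n} q_n^\perp) \le \psi^n(q_n^\perp)\, n \log d - \psi^n(q_n^\perp) \log \psi^n(q_n^\perp),$$

where $d^n$ is the dimension of the Hilbert space on which $D_{\psi^n}$ acts,
and $\lim_{n\to\infty} \psi^n(q_n^\perp) = 0$ it is easily seen that

$$\lim_{n\to\infty} \frac{1}{n} S(q_n^\perp D_{\psi^n} q_n^\perp) = 0. \tag{27}$$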



On the other hand, consider a spectral decomposition $D_{\psi^n} =
\sum_{i=1}^{d^n} \lambda_i e_i$ with one-dimensional mutually orthogonal projections
$e_i$. Recall that the entropy is almost convex, i.e.
for any probability vector $a := (a_1, \ldots, a_k)$ and any set
of density operators $D_1, \ldots, D_k$ we have $S(\sum_{i=1}^k a_i D_i) \le
\sum_{i=1}^k a_i S(D_i) + H(a)$ (cf. [21]), where $H(a)$ denotes the
Shannon entropy of the probability vector $a$. Inserting the
spectral decomposition above in (26) and using almost convexity
we arrive at

$$\begin{aligned}
0 \le S(D_n) - S(D_{\psi^n}) &\le \sum_{i=1}^{d^n} \lambda_i S(q_n e_i q_n + q_n^\perp e_i q_n^\perp) + H((\lambda_1, \ldots, \lambda_{d^n})) - S(D_{\psi^n}) \\
&= \sum_{i=1}^{d^n} \lambda_i S(q_n e_i q_n + q_n^\perp e_i q_n^\perp) \\
&\le \log(2) = 1,
\end{aligned}$$

where we have used $H((\lambda_1, \ldots, \lambda_{d^n})) = S(D_{\psi^n})$ to obtain the
equality, and the observation that the dimension of the range of
each $q_n e_i q_n + q_n^\perp e_i q_n^\perp$ is at most 2, which implies the last
inequality. This shows that

$$\lim_{n\to\infty} \frac{1}{n} S(D_n) = s$$

holds. Combining this with (27) and the right hand half of (26)
leads to the desired conclusion of the lemma. $\blacksquare$
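Our main ingredient for the proof of the coding theorem will
be the quantum version of the Shannon-McMillan theorem
(a.k.a. quantum AEP) for ergodic states on quasi-local algebras.
We refer to [4] for a proof.

Theorem 3.3 (Quantum AEP): Let $\psi$ be a stationary ergodic
state with entropy rate $s$ on a quasi-local algebra $\mathcal{A}^{\mathbb{Z}}$
built up from a finite dimensional $C^*$-algebra $\mathcal{A}$. Then for any
$\varepsilon > 0$ there is a sequence of orthogonal projections $(t_{n,\varepsilon})_{n\in\mathbb{N}}$,
$t_{n,\varepsilon} \in \mathcal{A}^n$, such that for all sufficiently large $n$ the following hold:

1) $\psi^n(t_{n,\varepsilon}) \ge 1 - \varepsilon$,

2) for each one-dimensional orthogonal projection $q \in \mathcal{A}^n$
which is dominated by $t_{n,\varepsilon}$, i.e. $q \le t_{n,\varepsilon}$, we have
$2^{-n(s+\varepsilon)} \le \psi^n(q) \le 2^{-n(s-\varepsilon)}$,

3) $2^{n(s-\varepsilon)} \le \mathrm{tr}(t_{n,\varepsilon}) \le 2^{n(s+\varepsilon)}$.

Moreover, the entropy typical subspace given by the range
of each $t_{n,\varepsilon}$ can be spanned by those eigenvectors of $D_{\psi^n}$
associated with the eigenvalues $\mu_{i,n}$ of $D_{\psi^n}$ which satisfy

$$2^{-n(s+\varepsilon)} \le \mu_{i,n} \le 2^{-n(s-\varepsilon)}. \qquad \Box$$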
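For a product state the typical projections of Theorem 3.3 can be written down explicitly. The following sketch assumes the one-site density operator $\mathrm{diag}(p, 1-p)$, an i.i.d. (hence stationary ergodic) example of our own choosing, and computes the weight and the dimension of the subspace spanned by the eigenvectors of $D^{\otimes n}$ whose eigenvalues lie in $[2^{-n(s+\varepsilon)}, 2^{-n(s-\varepsilon)}]$. The function names and the parameter values are illustrative assumptions.

```python
import math

def binary_entropy(p):
    """Entropy (in bits) of diag(p, 1 - p); equals the entropy rate s here."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def typical_projection_stats(p, n, eps):
    """Weight and dimension of the entropy-typical subspace of D^{(x) n}.

    The eigenvalue p^k (1 - p)^(n - k) of the n-fold product has
    multiplicity C(n, k); it is typical if its logarithm lies within
    n * eps of -n * s.
    """
    s = binary_entropy(p)
    weight, dim = 0.0, 0
    for k in range(n + 1):
        log_eig = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if -n * (s + eps) <= log_eig <= -n * (s - eps):
            mult = math.comb(n, k)
            dim += mult
            # Work in the log domain to avoid overflow of C(n, k).
            weight += 2.0 ** (math.log2(mult) + log_eig)
    return s, weight, dim

n = 400
s, weight, dim = typical_projection_stats(p=0.2, n=n, eps=0.1)
rate = math.log2(dim) / n   # normalized log-dimension of the typical subspace
```

In this example the typical subspace carries almost all of the weight while its normalized log-dimension `rate` stays within $\varepsilon$ of the entropy rate `s`, in line with items 1) and 3) of the theorem.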

An alternative, equivalent version of this theorem can be stated
for dimension covering exponents (cf. [4] for a proof), which are
defined in the following way: Consider any state $\psi$ on $\mathcal{A}^{\mathbb{Z}}$ and
$\varepsilon \in (0, 1)$. The dimension covering exponents are given by

$$\beta_{\varepsilon,n}(\psi) := \min\{\log \mathrm{tr}(q) : q \in \mathcal{A}^n \text{ projection},\ \psi^n(q) \ge 1 - \varepsilon\}.$$

Proposition 3.4: Let $\psi$ be a stationary ergodic state on $\mathcal{A}^{\mathbb{Z}}$
with v. Neumann entropy rate $s$. Then for any $\varepsilon \in (0,1)$

$$\lim_{n\to\infty} \frac{1}{n} \beta_{\varepsilon,n}(\psi) = s$$

holds.
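The convergence asserted in Proposition 3.4 can be observed numerically in the simplest case. The sketch below again assumes a product state with one-site density operator $\mathrm{diag}(p, 1-p)$, $p < 1/2$ (our own illustrative choice), for which the minimum in the definition of the dimension covering exponent is attained by collecting eigenvectors in order of decreasing eigenvalue. All names are hypothetical.

```python
import math
from fractions import Fraction

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def covering_exponent(p, n, eps):
    """log2 of the smallest rank of a projection with weight >= 1 - eps
    for the n-fold product of diag(p, 1 - p), assuming p < 1/2."""
    if not (0 < p < 0.5 and 0 < eps < 1):
        raise ValueError("need 0 < p < 1/2 and 0 < eps < 1")
    needed = 1.0 - eps          # weight still to be covered
    dim = 0                     # rank collected so far
    for k in range(n + 1):      # eigenvalues decrease with k because p < 1 - p
        mult = math.comb(n, k)
        log_eig = k * math.log2(p) + (n - k) * math.log2(1 - p)
        shell_weight = 2.0 ** (math.log2(mult) + log_eig)
        if shell_weight < needed:
            dim += mult
            needed -= shell_weight
        else:
            # Only a fraction of this eigenspace is required.
            dim += max(1, math.ceil(mult * Fraction(needed / shell_weight)))
            break
    return math.log2(dim)

s = binary_entropy(0.2)
rate_400 = covering_exponent(0.2, 400, 0.1) / 400
rate_1600 = covering_exponent(0.2, 1600, 0.1) / 1600
```

For these parameters the normalized exponents approach `s` from above as `n` grows; the gap is of the order $n^{-1/2}$, which is the usual fluctuation scale in the i.i.d. case.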



For the following lemma we need some preliminary notation.
Let $\mathcal{A}^{\mathbb{Z}}$ denote a quasi-local algebra built up from a finite
dimensional $C^*$-algebra $\mathcal{A}$. We consider a state $\psi$ on $\mathcal{A}^{\mathbb{Z}}$
and any sequence $(q_n)_{n\in\mathbb{N}}$ of orthogonal projections with
$q_n \in \mathcal{A}^n$. To the state $\psi$ and the sequence $(q_n)_{n\in\mathbb{N}}$ we associate a
family of sub-normalized states $q_n D_{\psi^n} q_n$, and for each $n \in \mathbb{N}$
we consider a diagonalization of these sub-normalized states:

$$q_n D_{\psi^n} q_n = \sum_{i=1}^{\mathrm{tr}(q_n)} \lambda_{i,n}\, q_{i,n}, \tag{28}$$

where $\lambda_{i,n}$ resp. $q_{i,n}$ denote eigenvalues resp. one-dimensional
projections onto an orthonormal basis of the range of $q_n$
consisting of eigenvectors of $q_n D_{\psi^n} q_n$. We abbreviate $I_n :=
\{1, \ldots, \mathrm{tr}(q_n)\}$.

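Lemma 3.5: Let $\psi$ be a stationary ergodic state on $\mathcal{A}^{\mathbb{Z}}$ with
v. Neumann entropy rate $s$ and let $(q_n)_{n\in\mathbb{N}}$ be a sequence of
projections with $q_n \in \mathcal{A}^n$ and $\lim_{n\to\infty} \psi^n(q_n) = 1$. Then for
each $\varepsilon > 0$ there is a sequence of projections $(r_n(\varepsilon))_{n\in\mathbb{N}}$ with
$r_n(\varepsilon) \le q_n$ and with

1) $\lim_{n\to\infty} \psi^n(r_n(\varepsilon)) = 1$.

2) For any one-dimensional projection $r \le r_n(\varepsilon)$ we have

$$2^{-n(s+\varepsilon)} \le \mathrm{tr}(q_n D_{\psi^n} q_n r) = \psi^n(r) \le 2^{-n(s-\varepsilon)}.$$

The range of $r_n(\varepsilon)$ is spanned by those eigenvectors of
$q_n D_{\psi^n} q_n$ whose eigenvalues satisfy

$$2^{-n(s+\varepsilon)} \le \lambda_{i,n} \le 2^{-n(s-\varepsilon)}.$$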

Proof: The proof is quite similar to the proof of Lemma
3.3 in [4] and is, in fact, much easier in the present situation
because we can now use the full Shannon-McMillan theorem
for ergodic quantum states, which was not at our disposal in [4]
(the Shannon-McMillan theorem for ergodic quantum states
was even proved there). For the reader's convenience we give the main
steps of this proof. First note that

$$S(q_n D_{\psi^n} q_n) = -\sum_{i \in I_n} \lambda_{i,n} \log \lambda_{i,n}, \tag{29}$$

and that according to Lemma 3.2

$$\lim_{n\to\infty} \frac{1}{n} S(q_n D_{\psi^n} q_n) = s. \tag{30}$$



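For $\delta > 0$ we consider the set

$$R_{1,n,\delta} := \{i \in I_n : \lambda_{i,n} > 2^{-n(s-\delta)}\}$$

and the projection

$$r_{1,n,\delta} := \sum_{i \in R_{1,n,\delta}} q_{i,n}.$$

Then it is clear that

$$\frac{1}{n} \log \mathrm{tr}(r_{1,n,\delta}) < s - \delta \qquad \forall n \in \mathbb{N}, \tag{31}$$

and $\psi^n(r_{1,n,\delta}) = \mathrm{tr}(q_n D_{\psi^n} q_n r_{1,n,\delta})$ hold. If we had
$\limsup_{n\to\infty} \psi^n(r_{1,n,\delta_0}) = a > 0$ for some $\delta_0 > 0$ then eq.
(31) would contradict Proposition 3.4. Hence we must have

$$\lim_{n\to\infty} \psi^n(r_{1,n,\delta}) = 0 \qquad \forall \delta > 0. \tag{32}$$

For $\sigma, \delta > 0$ we define the following sets

$$R_{2,n,\sigma} := \{i \in I_n : \lambda_{i,n} < 2^{-n(s+\sigma)}\}$$

and

$$T_{n,\sigma,\delta} := \{i \in I_n : 2^{-n(s+\sigma)} \le \lambda_{i,n} \le 2^{-n(s-\delta)}\}.$$

Then it is obvious that for $i \in R_{2,n,\sigma}$ resp. $i \in T_{n,\sigma,\delta}$ we have

$$-\frac{1}{n} \lambda_{i,n} \log \lambda_{i,n} > (s + \sigma) \lambda_{i,n}$$

respectively

$$-\frac{1}{n} \lambda_{i,n} \log \lambda_{i,n} \ge (s - \delta) \lambda_{i,n}.$$

Combining these inequalities with eq. (29) and (32) we are
led to

$$\frac{1}{n} S(q_n D_{\psi^n} q_n) \ge (s + \sigma) \psi^n(r_{2,n,\sigma}) + (s - \delta) \psi^n(r_{n,\sigma,\delta}) + o(1), \tag{33}$$

where

$$r_{2,n,\sigma} := \sum_{i \in R_{2,n,\sigma}} q_{i,n}, \quad \text{and} \quad r_{n,\sigma,\delta} := \sum_{i \in T_{n,\sigma,\delta}} q_{i,n}.$$

If we had $\limsup_{n\to\infty} \psi^n(r_{2,n,\sigma_0}) = a > 0$ for some $\sigma_0 > 0$
then after taking the limit $n \to \infty$ along a suitable subsequence
in eq. (33) we would have

$$s \ge s + \sigma_0 a - \delta(1 - a) > s,$$

for all sufficiently small $\delta > 0$, a contradiction. Thus, we have

$$\lim_{n\to\infty} \psi^n(r_{n,\sigma,\delta}) = 1 \qquad \forall \sigma, \delta > 0.$$

Now, if we define $r_n(\varepsilon) := r_{n,\varepsilon,\varepsilon}$ we have the desired
sequence of projections. The second item of the lemma is easily
verified since the range of $r_n(\varepsilon)$ is spanned by eigenvectors of
$q_n D_{\psi^n} q_n$ whose eigenvalues satisfy $2^{-n(s+\varepsilon)} \le \lambda_{i,n} \le
2^{-n(s-\varepsilon)}$. $\blacksquare$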

IV. Coding Theorem for IMC Channels 

Our goal in this section is to show that for each stationary
ergodic IMC cq-channel $W$ we have

$$C(W) = C_{\mathrm{Holevo}}(W), \tag{34}$$

where

$$C_{\mathrm{Holevo}}(W) := \lim_{n\to\infty} \frac{1}{n} C_n(W) = \sup_n \frac{1}{n} C_n(W), \tag{35}$$

and

$$C_n(W) := \max_{p^n \in \mathcal{P}(A^n)} \chi(p^n, W^n)$$

with the well known Holevo information

$$\chi(p^n, W^n) := S(D^n_{q(p)}) - \sum_{x^n \in A^n} p^n(x^n) S(D_{x^n})$$

and $D^n_{q(p)} := \sum_{x^n \in A^n} p^n(x^n) D_{x^n}$. An equivalent formula for
the Holevo information can be given as follows

$$\chi(p^n, W^n) = S(D^n_p) + S(D^n_{q(p)}) - S(D^n_{p,W}),$$

where $D^n_p = \sum_{x^n \in A^n} p^n(x^n) |x^n\rangle\langle x^n|$ and $D^n_{p,W}$ is given
in eq. (23). The equality in the last formula holds since
$S(D^n_{p,W}) - S(D^n_p) = \sum_{x^n \in A^n} p^n(x^n) S(D_{x^n})$ (cf. [25] Th.
11.8). Note that the limit in eq. (35) exists and is actually
equal to $\sup_n \frac{1}{n} C_n(W)$ since we have

$$C_{n+m}(W) \ge C_n(W) + C_m(W),$$

which is easily deduced from the fact that

$$\chi(p^n \otimes p^m, W^{n+m}) \ge \chi(p^n, W^n) + \chi(p^m, W^m), \tag{36}$$

which in turn is a consequence of the subadditivity of the
formally defined conditional v. Neumann entropy (cf. [25]
Theorem 11.16) and our assumption that the cq-channel is
stationary and IMC.
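For concreteness, the Holevo information entering the definition of $C_n(W)$ can be evaluated numerically in the smallest case $n = 1$ with qubit outputs. The following sketch is restricted to real symmetric $2 \times 2$ density matrices, whose eigenvalues are available in closed form; the three ensembles and all function names are illustrative assumptions of ours.

```python
import math

def eigenvalues_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius

def entropy(m):
    """von Neumann entropy S(m) = -tr(m log2 m) in bits."""
    return -sum(x * math.log2(x) for x in eigenvalues_sym2(m) if x > 1e-15)

def holevo_information(probs, states):
    """chi(p, W) = S(sum_x p(x) D_x) - sum_x p(x) S(D_x)."""
    avg = [[sum(p * d[i][j] for p, d in zip(probs, states)) for j in range(2)]
           for i in range(2)]
    return entropy(avg) - sum(p * entropy(d) for p, d in zip(probs, states))

ket0 = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
ket1 = [[0.0, 0.0], [0.0, 1.0]]   # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|

chi_orthogonal = holevo_information([0.5, 0.5], [ket0, ket1])  # one full bit
chi_overlap = holevo_information([0.5, 0.5], [ket0, plus])     # strictly less
chi_constant = holevo_information([0.5, 0.5], [plus, plus])    # no information
```

Orthogonal output states yield one full bit, overlapping pure states yield strictly less, and a channel whose output does not depend on the input yields zero.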

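Remark 4.1: For a probability distribution $p$ on $A^n$ the
formal conditional v. Neumann entropy is, in our case, given
by

$$S(p|q(p)) := S(D^n_{p,W}) - S(D^n_{q(p)}),$$

with

$$D^n_{p,W} = \sum_{x^n \in A^n} p(x^n) |x^n\rangle\langle x^n| \otimes D_{x^n}$$

and

$$D^n_{q(p)} = \sum_{x^n \in A^n} p(x^n) D_{x^n}.$$

The subadditivity of this quantity together with the stationarity and
IMC property of the channel means in the present situation that
for $p \in \mathcal{P}(A^n)$, $p' \in \mathcal{P}(A^m)$

$$S(p \otimes p'|q(p \otimes p')) \le S(p|q(p)) + S(p'|q(p'))$$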



holds.
Before entering the proof of the direct part of the coding
theorem we need some further preliminary lemmas. Let us
consider a stationary IMC cq-channel $W$ and a stationary
probability measure $p$ on $(A^{\mathbb{Z}}, \Sigma_c)$. Then, since the states
$\psi_{p,W}$, $\psi_p$ and $\psi_{q(p)} := \psi_{p,W}|_{\mathcal{B}^{\mathbb{Z}}}$ are stationary, we know
that their v. Neumann entropy rates exist. Let us denote by
$D^n_{p,W}$, $D^n_p$ resp. $D^n_{q(p)}$ the density operators of these states
when restricted to the algebras over $n$-blocks.
Let us define formally the information rate

$$i(p, W) := \lim_{n\to\infty} \frac{1}{n} \chi(p^n, W^n) = \lim_{n\to\infty} \frac{1}{n} \big(S(D^n_p) + S(D^n_{q(p)}) - S(D^n_{p,W})\big). \tag{37}$$

Note that $s(\psi_p) = h(p)$, where $h(p)$ denotes the Shannon
entropy rate of the probability measure $p$. Due to the results in
[29] the entropy rates in eq. (37) exist even for a larger class
of periodic states. Furthermore, let us introduce

$$C_{\mathrm{per}}(W) := \sup_{p \text{ periodic}} i(p, W),$$

$$C_{\mathrm{st}}(W) := \sup_{p \text{ stationary}} i(p, W),$$

and

$$C_{\mathrm{erg}}(W) := \sup_{p \text{ stationary ergodic}} i(p, W). \tag{38}$$



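Lemma 4.2: Let $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary IMC
cq-channel. Then we have

$$C_{\mathrm{Holevo}}(W) = C_{\mathrm{per}}(W) = C_{\mathrm{st}}(W) = C_{\mathrm{erg}}(W).$$

Proof: It is clear that

$$C_{\mathrm{Holevo}}(W) \ge C_{\mathrm{per}}(W) \ge C_{\mathrm{st}}(W) \ge C_{\mathrm{erg}}(W)$$

holds. Thus we only need to prove $C_{\mathrm{Holevo}}(W) = C_{\mathrm{erg}}(W)$.
By definition of the Holevo capacity, eq. (35), we can find for
any $\delta > 0$ a probability distribution $p_\delta \in \mathcal{P}(A^t)$ with

$$C_{\mathrm{Holevo}}(W) - \delta \le \frac{1}{t} \chi(p_\delta, W^t). \tag{39}$$

Set $p' := p_\delta^{\otimes\infty}$; then the probability measure $p'$ is $t$-periodic.
Now we define

$$p := \frac{1}{t} \sum_{i=0}^{t-1} p' \circ T^{-i}.$$

It is easily seen by standard arguments that $p$ is stationary
ergodic. In what follows we use the abbreviation $p_i := p' \circ
T^{-i}$. For each $i \in \{0, 1, \ldots, t-1\}$ and for each $n \in \mathbb{N}$ the
distributions $p_i^n$ can be written as

$$p_i^n = p_{l_i} \otimes p_\delta^{\otimes k_i} \otimes p_{r_i},$$

where $n = l_i + k_i t + r_i$ with $0 \le l_i + r_i < 2t$. Note that
$k_i$ depends on $n$ and that $\lim_{n\to\infty} \frac{k_i}{n} = \frac{1}{t}$ holds for all $i \in
\{0, \ldots, t-1\}$.

Using concavity of the Holevo information with respect to the
input distribution and eq. (36) we obtain the following chain of
inequalities:

$$\begin{aligned}
\frac{1}{n} \chi(p^n, W^n) &= \frac{1}{n} \chi\Big(\frac{1}{t} \sum_{i=0}^{t-1} p_i^n, W^n\Big) \\
&\ge \frac{1}{n} \frac{1}{t} \sum_{i=0}^{t-1} \chi(p_i^n, W^n) \\
&= \frac{1}{n} \frac{1}{t} \sum_{i=0}^{t-1} \chi\big(p_{l_i} \otimes p_\delta^{\otimes k_i} \otimes p_{r_i}, W^{l_i + k_i t + r_i}\big) \\
&\ge \frac{1}{n} \frac{1}{t} \sum_{i=0}^{t-1} \big(\chi(p_{l_i}, W^{l_i}) + \chi(p_{r_i}, W^{r_i}) + k_i \chi(p_\delta, W^t)\big) \\
&\ge \frac{\min_{i \in \{0,\ldots,t-1\}} k_i}{n} \chi(p_\delta, W^t) \\
&= \frac{1}{t} \chi(p_\delta, W^t) + o(1).
\end{aligned}$$

This yields

$$i(p, W) = \lim_{n\to\infty} \frac{1}{n} \chi(p^n, W^n) \ge \frac{1}{t} \chi(p_\delta, W^t).$$

Combining this with eq. (39) and the definition of $C_{\mathrm{erg}}(W)$, eq.
(38), we obtain

$$C_{\mathrm{erg}}(W) \ge C_{\mathrm{Holevo}}(W) - \delta.$$

Since the left side of this inequality does not depend on $\delta$ and
since $\delta > 0$ was arbitrary we can conclude that

$$C_{\mathrm{erg}}(W) \ge C_{\mathrm{Holevo}}(W).$$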



We will need some simple properties of projections in $\mathcal{F}(A) \otimes
\mathcal{B}$ in the proof of the next lemma, where $\mathcal{F}(A)$ denotes the set
of $\mathbb{C}$-valued functions on $A$. We again identify $\mathcal{F}(A)$ with
$\mathbb{C}^{|A|}$. It is elementary to show that each operator $a \in \mathcal{F}(A) \otimes \mathcal{B}$
can be written as

$$a = \sum_{x \in A} |x\rangle\langle x| \otimes a_x,$$

for appropriate $a_x \in \mathcal{B}$. Similarly, $a^* = a$ iff $a_x^* = a_x$ for all
$x \in A$. Moreover, it holds that $a^2 = a$ iff $a_x^2 = a_x$ for all $x \in
A$. Thus we can conclude that each projection $t \in \mathcal{F}(A) \otimes \mathcal{B}$
has a unique representation

$$t = \sum_{x \in A} |x\rangle\langle x| \otimes t_x \tag{40}$$

with projections $t_x \in \mathcal{B}$. This is an analogue of the representation
of a set in a Cartesian product of finite sets as a union
of its sections.

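Lemma 4.3 (Probability bounds): Let $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$
be a stationary ergodic IMC cq-channel and let $p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$
be a stationary ergodic probability measure. Then for each $\varepsilon >
0$ there is a sequence of orthogonal projections $(j_n(\varepsilon))_{n\in\mathbb{N}}$,
$j_n(\varepsilon) \in \mathcal{F}(A^n) \otimes \mathcal{B}^n$, with

$$\lim_{n\to\infty} \psi_{p,W}(j_n(\varepsilon)) = 1,$$

and which possesses the following additional properties: For each
$n \in \mathbb{N}$ there is a subset $T_n \subset A^n$ and a set of orthogonal projections
$(c_{x^n})_{x^n \in T_n}$ in $\mathcal{B}^n$ with $j_n(\varepsilon) = \sum_{x^n \in T_n} |x^n\rangle\langle x^n| \otimes c_{x^n}$
and

1) $\lim_{n\to\infty} p^n(T_n) = 1$.

2) For each $x^n \in T_n$ we have

$$2^{-n(s(\psi_p)+\varepsilon_n)} \le p^n(x^n) \le 2^{-n(s(\psi_p)-\varepsilon_n)}$$

with $s(\psi_p) = h(p)$ and an appropriate sequence $(\varepsilon_n)_{n\in\mathbb{N}}$ with
$1 > \varepsilon_n \searrow 0$.

3) $W^n(x^n, c_{x^n}) \ge 1 - \delta_n$ for all $x^n \in T_n$ and a suitable
sequence $(\delta_n)_{n\in\mathbb{N}}$ with $\lim_{n\to\infty} \delta_n = 0$.

4) For each $x^n \in T_n$, all sufficiently large $n \in \mathbb{N}$, and each
one-dimensional projection $e \le c_{x^n}$ we have

$$2^{-n(s(q(p)|p)+\varepsilon)} \le W^n(x^n, e) \le 2^{-n(s(q(p)|p)-\varepsilon)}, \tag{41}$$

with $s(q(p)|p) := s(\psi_{p,W}) - s(\psi_p)$. Consequently, the
dimension of the range of $c_{x^n}$ can be bounded by

$$2^{n(s(q(p)|p) - \varepsilon + \frac{1}{n} \log(1-\delta_n))} \le \mathrm{tr}(c_{x^n}) \le 2^{n(s(q(p)|p)+\varepsilon)}.$$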
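Proof: Choose an appropriate sequence $1 > \varepsilon_n \searrow 0$ such
that the quantum AEP, Theorem 3.3, holds simultaneously for
$\psi_{p,W}$, $\psi_p$ and $\psi_{q(p)}$ with $\varepsilon_n$ instead of $\varepsilon$, and denote by $t_n$, $t_{p,n}$
resp. $t_{q,n}$ the resulting entropy typical projections of $\psi_{p,W}$, $\psi_p$
resp. $\psi_{q(p)}$. An application of Lemma 3.1, part 2, yields that

$$\psi_{p,W}(t_{p,n} \otimes t_{q,n}) \ge 1 - \varepsilon_n - \sqrt{\varepsilon_n},$$

and Lemma 3.1, part 1, then implies that

$$\psi_{p,W}\big((t_{p,n} \otimes t_{q,n})\, t_n\, (t_{p,n} \otimes t_{q,n})\big) = 1 - \eta_n, \tag{42}$$

with $\eta_n \le \varepsilon_n + 2\sqrt{\varepsilon_n + \sqrt{\varepsilon_n}}$. Set

$$t'_n := R\big((t_{p,n} \otimes t_{q,n})\, t_n\, (t_{p,n} \otimes t_{q,n})\big),$$

where $R(a)$ denotes the projection onto the range of the
hermitian operator $a$. Then we have

$$t'_n \le t_{p,n} \otimes t_{q,n}, \tag{43}$$

$$\mathrm{tr}(t'_n) \le \mathrm{tr}(t_{p,n} \otimes t_{q,n}),\ \mathrm{tr}(t_n), \tag{44}$$

and

$$\psi_{p,W}(t'_n) \ge 1 - \eta_n \tag{45}$$

by eq. (42) and eq. (43). Lemma 3.5 gives us for each $\varepsilon > 0$
a sequence of projections $(t''_n)_{n\in\mathbb{N}}$ satisfying $t''_n \le t'_n$, $t''_n \in
\mathcal{F}(A^n) \otimes \mathcal{B}^n$, and with

$$\psi_{p,W}(t''_n) \ge 1 - \eta'_n, \qquad \lim_{n\to\infty} \eta'_n = 0, \tag{46}$$

and

$$2^{-n(s(\psi_{p,W})+\frac{\varepsilon}{2})} \le \psi_{p,W}(r) \le 2^{-n(s(\psi_{p,W})-\frac{\varepsilon}{2})} \tag{47}$$

for any one-dimensional orthogonal projection $r \le t''_n$. We
assume w.l.o.g. that $\eta'_n > 0$ for all $n \in \mathbb{N}$.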

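It is readily seen from (40) that for each $x^n \in A^n$ we have

$$(|x^n\rangle\langle x^n| \otimes 1)\, t''_n = t''_n\, (|x^n\rangle\langle x^n| \otimes 1), \tag{48}$$

where $1$ denotes the identity in $\mathcal{B}^n$. This yields

$$t''_n = \sum_{x^n \in A^n} (|x^n\rangle\langle x^n| \otimes 1)\, t''_n\, (|x^n\rangle\langle x^n| \otimes 1), \tag{49}$$

and if we define projections $c_{x^n}$ by

$$|x^n\rangle\langle x^n| \otimes c_{x^n} = (|x^n\rangle\langle x^n| \otimes 1)\, t''_n\, (|x^n\rangle\langle x^n| \otimes 1), \tag{50}$$

we can write

$$t''_n = \sum_{x^n \in P_n} |x^n\rangle\langle x^n| \otimes c_{x^n}, \tag{51}$$

with

$$P_n := \{x^n \in A^n : c_{x^n} \ne 0\}. \tag{52}$$

For the set $T_n$ given by

$$T_n := \{x^n \in P_n : W^n(x^n, c_{x^n}) \ge 1 - \sqrt{\eta'_n}\}, \tag{53}$$

we see that

$$p^n(T_n^c) \le \sqrt{\eta'_n} \tag{54}$$

holds, since by eq. (46) and eq. (51) we have

$$1 - \eta'_n \le \psi_{p,W}(t''_n) = \sum_{x^n \in P_n} p^n(x^n) W^n(x^n, c_{x^n}) \le p^n(T_n^c)(1 - \sqrt{\eta'_n}) + p^n(T_n).$$

Set

$$r_n := \sum_{x^n \in T_n} |x^n\rangle\langle x^n|$$

and

$$j_n(\varepsilon) := (r_n \otimes 1)\, t''_n\, (r_n \otimes 1) = \sum_{x^n \in T_n} |x^n\rangle\langle x^n| \otimes c_{x^n}. \tag{55}$$

Applying Lemma 3.1 one more time we arrive at

$$\lim_{n\to\infty} \psi_{p,W}(j_n(\varepsilon)) = 1.$$

By our construction we have $t''_n \le t'_n \le t_{p,n} \otimes t_{q,n}$ (see eq.
(43)). Since $r_n \otimes 1$ commutes with $t''_n$, i.e. $(r_n \otimes 1) t''_n =
t''_n (r_n \otimes 1)$, we have

$$(r_n \otimes 1)\, t''_n\, (r_n \otimes 1) \le t''_n \le t_{p,n} \otimes t_{q,n},$$

and therefore by the right hand side of eq. (55) for each $x^n \in
T_n \subset P_n$

$$|x^n\rangle\langle x^n| \otimes c_{x^n} \le t_{p,n} \otimes t_{q,n}.$$

This yields

$$|x^n\rangle\langle x^n| \le t_{p,n},$$

since $T_n \subset P_n$ (cf. eq. (52)). Consequently, for each one-dimensional
projection $|x^n\rangle\langle x^n| \le r_n$

$$2^{-n(s(\psi_p)+\varepsilon_n)} \le \psi_p(|x^n\rangle\langle x^n|) = p^n(x^n) \le 2^{-n(s(\psi_p)-\varepsilon_n)}. \tag{56}$$

Additionally, eq. (53) and eq. (54) yield that

$$\lim_{n\to\infty} \psi_p(r_n) = \lim_{n\to\infty} p^n(T_n) = 1$$

holds. Note that $j_n(\varepsilon) \le t''_n$ and for each one-dimensional
projection $e \in \mathcal{B}^n$ we have

$$p^n(x^n) W^n(x^n, e) = \psi_{p,W}(|x^n\rangle\langle x^n| \otimes e).$$

Thus by eq. (47) and eq. (56) for each $x^n \in T_n$ and for each
one-dimensional projection $e \le c_{x^n}$ we obtain eq. (41) for all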
sufficiently large $n$ for which $\varepsilon_n < \frac{\varepsilon}{2}$ holds. $\blacksquare$
Theorem 4.4 (Coding theorem: direct part): Let $W : A^{\mathbb{Z}} \times
\mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary ergodic IMC cq-channel. Then there
is a sequence of codes $(C_n)_{n\in\mathbb{N}}$ with

$$\liminf_{n\to\infty} \frac{1}{n} \log M_n \ge C_{\mathrm{Holevo}}(W), \tag{57}$$

and $\lim_{n\to\infty} e(C_n) = 0$, i.e. $C_{\mathrm{Holevo}}(W)$ is an achievable rate
for the cq-channel $W$, and consequently we have $C(W) \ge
C_{\mathrm{Holevo}}(W)$.

Proof: The proof is virtually the same as Winter's proof
[38] of the direct part of the coding theorem for memoryless
cq-channels. The only difference in the present situation is
that we use our Lemma 4.3 instead of the frequency typical
and frequency conditionally typical subspaces of Winter's
setting, which are defined in analogy to the frequency typical and
frequency conditionally typical subsets in the approach of
Wolfowitz (see [39]).

For the reader's convenience, and since we shall need this argument
in section V, we give the main steps.
From Lemma 4.2 we know that

$$C_{\mathrm{Holevo}}(W) = C_{\mathrm{erg}}(W).$$

Thus, for each $\delta > 0$ we can find a stationary ergodic
probability measure $p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$ with

$$i(p, W) \ge C_{\mathrm{erg}}(W) - \frac{\delta}{3}.$$

It suffices to show that for each $\delta > 0$ and $\lambda \in (0,1)$ there is
a sequence of codes $(C_n)_{n\in\mathbb{N}}$ with $M_n$ code words such that

1) $e(C_n) \le \lambda$ and

2) $M_n \ge 2^{n(C_{\mathrm{erg}}(W)-\delta)}$

for all $n \ge n(\delta, \lambda)$ with an appropriately chosen $n(\delta, \lambda) \in \mathbb{N}$.
With the notation from Lemma 4.3 let $n$ be large enough to
ensure that $\delta_n < \frac{\lambda}{2}$.

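According to Lemma 4.3 with $\varepsilon = \frac{\delta}{3}$ there is $x^n \in T_n$ and
$c_{x^n} \in \mathcal{B}^n$ with

$$W^n(x^n, c_{x^n}) = \mathrm{tr}(D_{x^n} c_{x^n}) > 1 - \lambda.$$

Set $u_1 := x^n$ and $b_1 := c_{x^n}$. Note that $b_1$ is a projection.
In the next step choose $x^n \in T_n$ and $c_{x^n}$ with

$$\mathrm{tr}(D_{x^n} (1 - b_1) c_{x^n} (1 - b_1)) > 1 - \lambda$$

and set $u_2 := x^n$ and $b_2 := R((1 - b_1) c_{x^n} (1 - b_1))$ (the projection
onto the range of $(1 - b_1) c_{x^n} (1 - b_1)$). Then it is clear that

$$W^n(u_2, b_2) > 1 - \lambda$$

since $b_2 \ge (1 - b_1) c_{u_2} (1 - b_1)$.

If $u_1, \ldots, u_k$ and $b_1, \ldots, b_k$ are already constructed then
choose $x^n \in T_n$ with

$$\mathrm{tr}\Big(D_{x^n} \Big(1 - \sum_{i=1}^k b_i\Big) c_{x^n} \Big(1 - \sum_{i=1}^k b_i\Big)\Big) > 1 - \lambda,$$

and set

$$u_{k+1} := x^n \quad \text{and} \quad b_{k+1} := R\Big(\Big(1 - \sum_{i=1}^k b_i\Big) c_{x^n} \Big(1 - \sum_{i=1}^k b_i\Big)\Big).$$

Continue this procedure until no further prolongation of the
code is possible. Note that each $b_i$ is an orthogonal projection
and that $b_i b_j = \delta_{ij} b_i$ holds.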
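The greedy prolongation described above can be illustrated in the commutative special case, where all density operators and projections are diagonal. There a projection is just a subset of output symbols and $R((1-b)\,c_{x^n}\,(1-b))$ is the set of outputs in $c_{x^n}$ not yet claimed by $b$, so the procedure reduces to the classical maximal-code construction. The toy channel, the candidate sets and all names below are hypothetical; this is a sketch of the idea and not the construction for genuinely non-commuting operators.

```python
def greedy_code(channel, candidate_sets, lam):
    """Maximal-code construction in the commutative special case.

    channel[x] maps output symbols to probabilities (a diagonal D_x),
    candidate_sets[x] is the support of the projection c_x, and lam is
    the error threshold.
    """
    used = set()      # support of b_1 + ... + b_k
    code = []         # list of (code word, decoding set)
    # One pass suffices: a word rejected once stays rejected, because
    # `used` only grows and the remaining mass can only shrink.
    for x, c_x in candidate_sets.items():
        remaining = c_x - used
        mass = sum(channel[x].get(y, 0.0) for y in remaining)
        if mass > 1.0 - lam:
            code.append((x, remaining))
            used |= remaining
    return code

# Assumed toy data: three inputs, three output symbols.
channel = {
    "a": {0: 0.9, 1: 0.1},
    "b": {1: 0.1, 2: 0.9},
    "c": {0: 0.5, 2: 0.5},
}
candidate_sets = {"a": {0}, "b": {2}, "c": {0, 2}}
code = greedy_code(channel, candidate_sets, lam=0.2)
```

In this toy example the third input cannot be added because its candidate set is already exhausted by the first two decoding sets, which is exactly the stopping situation exploited in the counting argument that follows.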

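Let us write $C_n = (u_i, b_i)_{i=1}^{M_n}$ for the resulting code and set

$$b_n := \sum_{i=1}^{M_n} b_i.$$

For $\eta := \min\{1 - \lambda, \frac{\lambda^2}{16}\} > 0$ we claim that

$$W^n(x^n, b_n) = \mathrm{tr}(D_{x^n} b_n) \ge \eta \qquad \forall x^n \in T_n. \tag{58}$$

This is clear for code words. If we had

$$\mathrm{tr}(D_{x^n} (1 - b_n)) > 1 - \eta$$

for some $x^n \in T_n \setminus \{u_1, \ldots, u_{M_n}\}$, then we would have

$$\mathrm{tr}(D_{x^n} (1 - b_n) c_{x^n} (1 - b_n)) \ge 1 - \delta_n - 2\sqrt{\eta} > 1 - \frac{\lambda}{2} - \frac{\lambda}{2} = 1 - \lambda$$

by Lemma 3.1 and our restriction to those $n$ for which $\delta_n < \frac{\lambda}{2}$
holds. The last inequality implies that we could prolong our
code, which is not possible by our code construction. Averaging
eq. (58) with respect to $p^n$ we obtain

$$\psi^n_{q(p)}(b_n) \ge \eta\, p^n(T_n) \ge \frac{\eta}{2}$$

for all sufficiently large $n$. Then Proposition 3.4 yields that

$$\mathrm{tr}(b_n) \ge 2^{n(s(\psi_{q(p)})-\frac{\delta}{3})} \tag{59}$$

for all $n$ which are large enough.

On the other hand, we see by our code construction that
$\mathrm{tr}(b_i) \le \mathrm{tr}(c_{u_i})$ for all $i \in \{1, \ldots, M_n\}$ and it follows from
Lemma 4.3, item 4, that for all sufficiently large $n$

$$\mathrm{tr}(b_n) \le M_n 2^{n(s(q(p)|p)+\frac{\delta}{3})} \tag{60}$$

holds. Combining eq. (59) and eq. (60) we obtain

$$M_n \ge 2^{n(s(\psi_{q(p)}) - s(q(p)|p) - \frac{2\delta}{3})} = 2^{n(i(p,W) - \frac{2\delta}{3})} \ge 2^{n(C_{\mathrm{erg}}(W) - \delta)}.$$

This concludes our proof since $C_{\mathrm{Holevo}}(W) = C_{\mathrm{erg}}(W)$. $\blacksquare$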
Theorem 4.5 (Coding theorem: weak converse): Let $W :
A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary IMC channel. Then for each
code $C_n = (u_i, b_i)_{i=1}^{M_n}$ with $M_n \ge 2^{n(C_{\mathrm{Holevo}}(W)+\varepsilon)}$ we have

$$\bar{e}(C_n) \ge 1 - \frac{C_{\mathrm{Holevo}}(W) + \frac{1}{n}}{C_{\mathrm{Holevo}}(W) + \varepsilon},$$

where $\bar{e}(C_n)$ denotes the average error probability of the code
$C_n$.

Proof: This is an easy consequence of the Fano inequality
and the Holevo bound, and the proof is identical to that in
the memoryless case. For completeness we provide the full
argument. We may suppose that $\sum_{i=1}^{M_n} b_i = 1$ since otherwise
we can add $b_{M_n+1} := 1 - \sum_{i=1}^{M_n} b_i$ to $b_1$ without affecting
code performance. We define a stochastic matrix by setting

$$K(j|i) := \mathrm{tr}(D_{u_i} b_j) \qquad (i = 1, \ldots, M_n,\ j = 1, \ldots, M_n).$$

Consider the probability distribution $p$ on $A^n$ which assigns
probability $\frac{1}{M_n}$ to each of the code words $u_i$. Then by the Holevo
bound [16] we have

$$\chi(p, W^n) \ge I(p, K),$$

where $I(p, K)$ denotes the mutual information computed with
respect to the given input and channel data $p$ and $K$. Combining
this inequality with the definition of the Holevo capacity (35) and
with the Fano inequality (see e.g. [39]) we obtain

$$\begin{aligned}
C_{\mathrm{Holevo}}(W) &= \sup_l \frac{1}{l} C_l(W) \qquad \text{(by eq. (35))} \\
&\ge \frac{1}{n} C_n(W) \ge \frac{1}{n} I(p, K) \\
&\ge \frac{1}{n} \big(\log M_n - 1 - \bar{e}(C_n) \log(M_n - 1)\big) \\
&\ge (1 - \bar{e}(C_n)) \frac{1}{n} \log(2^{n(C_{\mathrm{Holevo}}(W)+\varepsilon)}) - \frac{1}{n}.
\end{aligned}$$

This yields

$$\bar{e}(C_n) \ge 1 - \frac{C_{\mathrm{Holevo}}(W) + \frac{1}{n}}{C_{\mathrm{Holevo}}(W) + \varepsilon}$$

as desired.
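Rearranging the last line of the chain of inequalities in the proof, $C \ge (1 - \bar{e})(C + \varepsilon) - \frac{1}{n}$ with $C$ the Holevo capacity, gives the lower bound $\bar{e} \ge 1 - (C + \frac{1}{n})/(C + \varepsilon)$ on the average error probability. The following sketch tabulates this bound for a few block lengths; the function name and the numerical values of the capacity and of $\varepsilon$ are purely illustrative assumptions, and logarithms are taken to base 2.

```python
def weak_converse_bound(capacity, eps, n):
    """Lower bound 1 - (C + 1/n) / (C + eps) on the average error
    probability of a block-length-n code with at least 2^{n(C + eps)}
    code words, obtained by rearranging C >= (1 - e)(C + eps) - 1/n."""
    if capacity < 0 or eps <= 0 or n < 1:
        raise ValueError("need capacity >= 0, eps > 0 and n >= 1")
    return 1.0 - (capacity + 1.0 / n) / (capacity + eps)

# Hypothetical numbers: capacity of 1 bit per channel use, rate excess 0.5.
bounds = [weak_converse_bound(capacity=1.0, eps=0.5, n=n) for n in (10, 100, 1000)]
limit = 0.5 / (1.0 + 0.5)   # value eps / (C + eps) approached as n grows
```

The bound increases with the block length and approaches $\varepsilon / (C + \varepsilon)$, which stays strictly below 1; this is why the statement is only a weak converse.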



V. Extension to DIMA Channels 

The main ingredient in our proof of the direct part of the coding
theorem for IMC channels, Theorem 4.4, was Lemma 4.3 on
probability bounds. There we heavily used our assumption
that the channel was IMC. If, instead of the IMC condition, we
merely assume that the channel is stationary ergodic we obtain,
by an inspection of the proof of Lemma 4.3, corresponding
probability bounds for the induced channel

$$W'^{n}(x^n, b) := \frac{\psi^n_{p,W}(|x^n\rangle\langle x^n| \otimes b)}{p^n(x^n)}, \tag{61}$$

which is defined for all $x^n \in A^n$ with $p^n(x^n) \ne 0$ for a fixed
stationary ergodic $p \in \mathcal{P}(A^{\mathbb{Z}}, \Sigma_c)$. Indeed, we may w.l.o.g.
restrict ourselves to those $x^n \in A^n$ with $p^n(x^n) = p([x^n]) \ne
0$, where $[x^n]$ denotes the cylinder set generated by $x^n$. Then
observe that

$$W'^{n}(x^n, b) = \frac{1}{p^n(x^n)} \int_{[x^n]} W(x, b)\, p(dx) \qquad (b \in \mathcal{B}^n), \tag{62}$$

by our definition of the joint state, eq. (2). Now, it suffices to
replace $W^n$ by $W'^{n}$ in the proof of Lemma 4.3 to obtain the
desired extension of that result. Note that the code construction
in Theorem 4.4 depends only on Lemma 4.3; thus we obtain
for all $\delta > 0$, $\lambda \in (0,1)$ and all sufficiently large $n \in \mathbb{N}$ codes
$C_n = (u_i, b_i)_{i=1}^{M_n}$ with

1) $\max_{i \in \{1,\ldots,M_n\}} (1 - W'^{n}(u_i, b_i)) \le \lambda$,

2) $M_n \ge 2^{n(C_{\mathrm{erg}}(W)-\delta)}$,

where $C_{\mathrm{erg}}(W)$ is now defined as

$$C_{\mathrm{erg}}(W) := \sup_{p \text{ stationary ergodic}} i(p, W), \tag{63}$$

with

$$i(p, W) := s(\psi_p) + s(\psi_{q(p)}) - s(\psi_{p,W}). \tag{64}$$



Since $W'^{n}$ appears in the first item above, the code we have
obtained is not a code for the channel $W$ with prescribed
error probability. If we assume in addition to stationarity and
ergodicity that $W$ fulfills the DIMA condition, eq. (22), the code
above is easily converted into one with low error probability
for $W$: By eq. (62) we have

$$1 - \lambda \le W'^{n}(u_i, b_i) = \frac{1}{p^n(u_i)} \int_{[u_i]} W(x, b_i)\, p(dx),$$

for all $i \in \{1, \ldots, M_n\}$, so that for each $i$ there must be at
least one $x(i) \in [u_i]$ with

$$W(x(i), b_i) \ge 1 - \sqrt{\lambda}.$$

Employing the DIMA condition we find positive integers $m, a$
such that

$$|W(x, b_i) - W(x(i), b_i)| \le \sqrt{\lambda},$$

for all $i \in \{1, \ldots, M_n\}$ whenever $x_j = x(i)_j$ for $1 - m \le
j \le n + a + 1$. Thus setting $u'_i := x(i)_{1-m}^{n+a+1}$ and $b'_i := b_i \in
\mathcal{B}^n \subset \mathcal{B}^{[1-m,n+a+1]}$ we obtain the desired sequence of codes
for the channel $W$ after shifting the sequences $u'_i$ as well as
the decoding operators $b'_i$ $m$ places to the right. Thus we have
proved

Theorem 5.1 (DIMA coding theorem: direct part): Let
$W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary ergodic DIMA cq-channel.
Then $C_{\mathrm{erg}}(W)$, defined in eq. (63), is an achievable rate, and
thus $C(W) \ge C_{\mathrm{erg}}(W)$.

In the proof of the converse part we shall need the periodic
product information capacity which is defined by

$$C_{pp}(W) := \sup_{p \text{ periodic product}} i(p, W), \tag{65}$$

$i(p, W)$ being given by eq. (64).

Theorem 5.2 (DIMA coding theorem: weak converse): Let
$W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be a stationary ergodic DIMA cq-channel.
Then $C_{pp}(W) = C_{\mathrm{erg}}(W)$ holds. Moreover, for each $\varepsilon > 0$
and any code $C_n = (u_i, b_i)_{i=1}^{M_n}$ with $M_n \ge 2^{n(C_{pp}(W)+\varepsilon)}$ we
have

$$\bar{e}(C_n) \ge 1 - \frac{C_{pp}(W) + \frac{1}{n}}{C_{pp}(W) + \varepsilon}.$$



Proof: We divide the proof in two parts. In the first 
part we infer from Holevo bound and Fano's inequality that 
Cpp{W) > C{W) holds for each stationary cq-channel W. 
For DIMA channels we then know that Cpp{W) > C{W) > 
Ceig(VF) from the direct part of the coding theorem. Theorem 
15.11 Finally we show that Cpp{W) — Cerg(M^) which is a 
consequence of the affinity of the v. Neumann entropy rate on 
periodic states and the fact that shifting a periodic state one 
site to the left/right does not change its entropy rate. 
Let $C_n = (u_i, b_i)_{i=1}^{M_n}$ be a code for $W$ with $M_n \ge 2^{n(C_{pp}(W)+\varepsilon)}$.
Let $p_n$ denote the distribution on $A^n$ which
assigns probability $\frac{1}{M_n}$ to each of the code words $u_i$, $i = 1, \ldots, M_n$,
and consider the product probability measure
$p := \cdots \otimes p_n \otimes p_n \otimes p_n \otimes \cdots$ with period $n$ on $(A^{\mathbb{Z}}, \Sigma_c)$. Then we have
$h(p) = \frac{1}{n} \log M_n$. Moreover, we need the family of induced
channels $\{W'^{l}\}_{l \in \mathbb{N}}$ from eq. (…), or equivalently eq. (62),
defined with respect to the periodic product measure $p$. If we
denote by $D_{\psi^l_{p,W}}$ the density operator of the state $\psi^l_{p,W}$, then
we have

$$D_{\psi^l_{p,W}} = {\sum_{x^l}}' \, p^l(x^l)\, |x^l\rangle\langle x^l| \otimes D_{x^l},$$

where $D_{x^l}$ is the density operator of $W'^{l}(x^l, \cdot)$ and $\sum'$
indicates that the summation is performed only over those $x^l$
with $p^l(x^l) > 0$. Note that then for each $l \in \mathbb{N}$

$$\chi(p^l, W'^{l}) = S(p^l) + S(\varphi^l_{q(p)}) - S(\psi^l_{p,W}) \tag{66}$$

holds, and thus we have

$$i(p, W) = \lim_{l \to \infty} \frac{1}{l}\, \chi(p^l, W'^{l}), \tag{67}$$

by eq. (64). If we introduce the average error probability
$e(C_n, W')$ with respect to the induced channel $W'$, i.e.

$$e(C_n, W') := \frac{1}{M_n} \sum_{i=1}^{M_n} \bigl(1 - W'^{n}(u_i, b_i)\bigr), \tag{68}$$

then we can argue in a similar fashion as in the proof of
Theorem 4.5 and obtain

$$\chi(p_n, W'^{n}) \ge (1 - e(C_n, W')) \log\bigl(2^{n(C_{pp}(W)+\varepsilon)}\bigr) - 1. \tag{69}$$

For each $l \in \mathbb{N}$ we write $l = kn + r$, $0 \le r < n$. Using the
superadditivity of $\chi$ with respect to product states, which follows
from the subadditivity of the conditional v. Neumann entropy
(see Theorem 11.16 in [25]), together with the $n$-periodicity of
the channel and the resulting states, we arrive at

$$\chi(p^l, W'^{l}) \ge k\, \chi(p_n, W'^{n}) + \chi(p^r, W'^{r}). \tag{70}$$

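Combining eq. (70) with eq. (69) and dividing by $l$ we obtain

$$\begin{aligned}
\frac{1}{l}\, \chi(p^l, W'^{l}) &\ge \frac{k}{l}\, \chi(p_n, W'^{n}) + o(1) \\
&\ge \frac{k}{l} \Bigl( (1 - e(C_n, W')) \log\bigl(2^{n(C_{pp}(W)+\varepsilon)}\bigr) - 1 \Bigr) + o(1) \\
&= \frac{k}{l} \Bigl( (1 - e(C_n, W'))\, n\, (C_{pp}(W)+\varepsilon) - 1 \Bigr) + o(1).
\end{aligned}$$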

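Taking the limit $l \to \infty$, so that $k/l \to 1/n$, and taking into account eq. (66), eq. (67)
and the definition of $C_{pp}(W)$, eq. (65), leads to

$$C_{pp}(W) \ge i(p, W) \ge (1 - e(C_n, W'))\,(C_{pp}(W) + \varepsilon) - \frac{1}{n}, \tag{71}$$

or, equivalently,

$$e(C_n, W') \ge 1 - \frac{C_{pp}(W) + \frac{1}{n}}{C_{pp}(W) + \varepsilon}. \tag{72}$$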



From the definition of the average error probability, eq. (24), we infer
that $e(C_n) \ge e(C_n, W')$, and thus inequality (72) yields

$$e(C_n) \ge e(C_n, W') \ge 1 - \frac{C_{pp}(W) + \frac{1}{n}}{C_{pp}(W) + \varepsilon}. \tag{73}$$

This shows that for each stationary cq-channel $W$ the inequality

$$C_{pp}(W) \ge C(W) \tag{74}$$

holds. If $W$ is in addition DIMA, then the direct part, Theorem
5.1, together with eq. (74) yields

$$C_{pp}(W) \ge C(W) \ge C_{erg}(W).$$

Thus we have to prove the converse inequality

$$C_{pp}(W) \le C_{erg}(W). \tag{75}$$

For any $\delta > 0$ there is a product probability measure $p$ on
$(A^{\mathbb{Z}}, \Sigma_c)$ with period $t \in \mathbb{N}$ with

$$C_{pp}(W) - \delta < i(p, W). \tag{76}$$

Then the probability measure given by

$$p' := \frac{1}{t} \sum_{i=0}^{t-1} p \circ T_{in}^{-i}$$

is stationary ergodic. Note that the joint input-output state and
the output state depend affinely on the input probability measure.
Using the defining formulas for the joint input-output state and the
resulting output state together with the change of variable formula
one immediately sees that



and 



$$\varphi_{q(p \circ T_{in}^{-i})}(b) = \varphi_{q(p)}(T_{out}^{i}\, b)$$



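hold. Arguing as in the proof of Theorem 3.1.3 in [4] shows
that

$$s(\psi_{p \circ T_{in}^{-i},\, W}) = s(\psi_{p, W}), \qquad s(\varphi_{q(p \circ T_{in}^{-i})}) = s(\varphi_{q(p)}),$$

and

$$s(p \circ T_{in}^{-i}) = s(p).$$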



Using this and the affinity of the v. Neumann entropy rate on
periodic states, we obtain

$$i(p', W) = \frac{1}{t} \sum_{i=0}^{t-1} i(p \circ T_{in}^{-i}, W) = i(p, W).$$

This equality together with (76) shows that inequality (75)
is valid, since $\delta > 0$ was arbitrary. ■
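The weak converse can be made tangible numerically. The sketch below assumes that the bound obtained by rearranging the chain of inequalities in the proof reads $e(C_n) \ge 1 - (C_{pp}(W) + 1/n)/(C_{pp}(W) + \varepsilon)$; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def weak_converse_lower_bound(c_pp: float, eps: float, n: int) -> float:
    """Return 1 - (c_pp + 1/n) / (c_pp + eps), a lower bound on e(C_n).

    c_pp : assumed value of the periodic product information capacity
    eps  : rate excess above c_pp (must be positive)
    n    : block length of the code
    """
    if c_pp < 0 or eps <= 0 or n <= 0:
        raise ValueError("need c_pp >= 0, eps > 0 and n >= 1")
    return 1.0 - (c_pp + 1.0 / n) / (c_pp + eps)


def asymptotic_lower_bound(c_pp: float, eps: float) -> float:
    """Limit of the bound as n -> infinity, namely eps / (c_pp + eps)."""
    if c_pp < 0 or eps <= 0:
        raise ValueError("need c_pp >= 0 and eps > 0")
    return eps / (c_pp + eps)


if __name__ == "__main__":
    c_pp, eps = 1.0, 0.1  # hypothetical values, in bits per channel use
    for n in (10, 100, 1000, 10000):
        print(n, weak_converse_lower_bound(c_pp, eps, n))
    print("limit", asymptotic_lower_bound(c_pp, eps))
```

For fixed $\varepsilon > 0$ the bound tends to $\varepsilon/(C_{pp}(W)+\varepsilon) > 0$, so the error probability stays bounded away from zero at rates above $C_{pp}(W)$; it does not tend to one, which is why the statement is a weak rather than a strong converse.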



VI. Classical Capacity of Output Weakly Mixing 
Quantum Channels 

In this section we show that the results obtained so far immediately
imply capacity results for the transmission of classical
information through output weakly mixing quantum channels.
The extension to ergodic quantum channels is postponed to
future work in order to keep the size of this paper reasonable.
We consider two finite-dimensional Hilbert spaces $\mathcal{H}_1, \mathcal{H}_2$ and the
corresponding algebras of linear operators $\mathcal{A} := \mathcal{L}(\mathcal{H}_1)$,
$\mathcal{B} := \mathcal{L}(\mathcal{H}_2)$ together with the quasi-local C*-algebras $\mathcal{A}^{\mathbb{Z}}$ and $\mathcal{B}^{\mathbb{Z}}$. A
quantum channel is described by a linear, completely positive,
unital map $K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$. $K$ is called $(T_{in}, T_{out})$-stationary
if $K \circ T_{out} = T_{in} \circ K$ holds for the shifts $T_{out}$ resp. $T_{in}$ on $\mathcal{B}^{\mathbb{Z}}$
resp. $\mathcal{A}^{\mathbb{Z}}$. The quantum channel $K$ is $(T_{in}, T_{out})$-ergodic if
it is an extreme point in the convex set of $(T_{in}, T_{out})$-stationary
quantum channels.

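A convenient sufficient condition for ergodicity of a stationary
channel $K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$, as in the classical theory (cf. [15]),
is that it is output weakly mixing: A quantum channel $K$ is
said to be output weakly mixing if for any state $\psi \in \mathcal{S}(\mathcal{A}^{\mathbb{Z}})$
and all $b_1, b_2 \in \mathcal{B}^{\mathbb{Z}}$

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \bigl| \psi(K(b_1 T_{out}^{i} b_2)) - \psi(K(b_1) K(T_{out}^{i} b_2)) \bigr| = 0 \tag{77}$$

holds. Obviously, the condition in eq. (77) need only be
checked on elements in $\mathcal{B}^{loc}$ (see section VII-B). The proof
that each output weakly mixing channel is ergodic will be
given in a forthcoming paper. We will merely show here how
such a channel induces an ergodic cq-channel.

Remark 6.1: It is readily seen that the condition (77) for
cq-channels is equivalent to

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \bigl| W(x, b_1 T_{out}^{i} b_2) - W(x, b_1) W(x, T_{out}^{i} b_2) \bigr| = 0$$

for all $x \in A^{\mathbb{Z}}$, which reflects the classical condition of a
channel being output weakly mixing as given in [15].

The continuity notions of section III-B are easily extended to
the present setting. E.g. a channel $K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ is said to
have decaying input memory and anticipation (DIMA) if for
all $n \in \mathbb{Z}$, $k \in \mathbb{N}$

$$\lim_{m, a \to \infty} \; \sup_{\varphi =_{m,a,n,k} \varphi'} d_{n,k}(\varphi \circ K, \varphi' \circ K) = 0,$$

where

$$d_{n,k}(\varphi \circ K, \varphi' \circ K) := \sup_{b \in \mathcal{B}^{[n, n+k]} :\, 0 \le b \le 1} \bigl| \varphi \circ K(b) - \varphi' \circ K(b) \bigr|,$$

and $\varphi =_{m,a,n,k} \varphi'$ means that the states $\varphi$ and $\varphi'$ on $\mathcal{A}^{\mathbb{Z}}$ have
equal restrictions to $\mathcal{A}^{[n-m,\, n+k+a]}$.

A quantum channel $K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ is called input memoryless
and causal (IMC) if $K(\mathcal{B}^{[n, n+k]}) \subset \mathcal{A}^{[n, n+k]}$ for all integers
$n$ and $k$ with $k \ge 0$.

We denote, as before, $\mathcal{A}^{[1,n]}$ by $\mathcal{A}^{n}$ with a similar abbreviation
for $\mathcal{B}^{[1,n]}$. A code of length $M_n$ for transmission of classical
information via the quantum channel $K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ is a
family $C_n := (\varphi_i, b_i)_{i=1}^{M_n}$ consisting of states $\varphi_i \in \mathcal{S}(\mathcal{A}^{n})$,
$i = 1, \ldots, M_n$, and decoding operators $0 \le b_i \in \mathcal{B}^{n}$ with
$\sum_{i=1}^{M_n} b_i \le \mathbf{1}$.
The error probability of a code $C_n$ is given by

$$e(C_n) := \max_{i = 1, \ldots, M_n} \; \sup_{\psi \in \mathcal{S}(\mathcal{A}^{\mathbb{Z}}) :\, \psi^n = \varphi_i} \bigl(1 - \psi(K(b_i))\bigr), \tag{78}$$

where $\psi^n$ denotes the restriction of $\psi$ to $\mathcal{A}^{n}$. The capacity
of the channel $K$ is then defined in the usual way.

We consider a stationary output weakly mixing IMC quantum
channel $K : \mathcal{B}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$. Our goal is to show that the classical
capacity of this channel is given by

$$C_{Holevo}(K) := \lim_{n \to \infty} \frac{1}{n}\, C_n(K),$$

where

$$C_n(K) := \sup_{\{p_i, \varphi_i\}_{i=1}^{m}} \left( S\Bigl( \sum_{i=1}^{m} p_i\, \varphi_i \circ K \Bigr) - \sum_{i=1}^{m} p_i\, S(\varphi_i \circ K) \right)$$

and the least upper bound is taken over all ensembles on $\mathcal{A}^{n}$,
i.e. $p_i \ge 0$ with $\sum_{i=1}^{m} p_i = 1$ and each $\varphi_i$ is a state on $\mathcal{A}^{n}$.

For any $\eta > 0$ there is a positive integer $n$ and an ensemble
$\{p_i, \varphi_i\}_{i=1}^{m}$ such that

$$\frac{1}{n} \left( S\Bigl( \sum_{i=1}^{m} p_i\, \varphi_i \circ K \Bigr) - \sum_{i=1}^{m} p_i\, S(\varphi_i \circ K) \right) \ge C_{Holevo}(K) - \eta \tag{79}$$

holds. Let $A := \{1, 2, \ldots, m\}$ and for each $x \in A^{\mathbb{Z}}$ consider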



the states

$$\varphi_x := \cdots \otimes \varphi_{x_{-1}} \otimes \varphi_{x_0} \otimes \varphi_{x_1} \otimes \cdots$$

on $\mathcal{A}^{\mathbb{Z}}$. Moreover
we consider the stationary product distribution $p$ built up from
the probability vector $(p_1, \ldots, p_m)$. Let $W : A^{\mathbb{Z}} \times \mathcal{B}^{\mathbb{Z}} \to \mathbb{C}$ be the
cq-channel given by

$$W(x, b) := \varphi_x(K(b)).$$

It is readily seen, using our assumption that $K$ is stationary
and output weakly mixing, that the cq-channel $W$ is stationary,
IMC and output weakly mixing. A calculation similar to the
proof of Lemma 9.4.3 in [15] shows that $W$ is ergodic. Thus
all results from section IV apply and show that by Theorem
4.4 there is a sequence of codes $C_n$ of lengths $M_n$ for the
cq-channel $W$ with

$$\liminf_{n \to \infty} \frac{1}{n} \log M_n \ge C_{Holevo}(W),$$

where $C_{Holevo}(W)$ is defined in eq. (35). It is obvious that
the codes $C_n$ for $W$ generate codes $C'_n$ for $K$ with the same
lengths and error probabilities. Note that

$$C_{Holevo}(W) \ge S\Bigl( \sum_{i=1}^{m} p_i\, \varphi_i \circ K \Bigr) - \sum_{i=1}^{m} p_i\, S(\varphi_i \circ K).$$

And thus from (35) and (79) we can infer that

$$\liminf_{n \to \infty} \frac{1}{n} \log M_n \ge C_{Holevo}(K) - \eta$$

holds. This shows that all rates below $C_{Holevo}(K)$ are achievable.
The weak converse is shown in the same vein as in the
memoryless case, see the proof of Theorem 4.5.
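The quantity maximized in the definition of $C_n(K)$ is the Holevo quantity $S(\sum_i p_i \rho_i) - \sum_i p_i S(\rho_i)$ of an ensemble of output states. As a minimal sketch, the following evaluates it in the special case where all $\rho_i$ are diagonal in one common basis, so that each state is described by its list of eigenvalues; this commuting restriction, the base-2 logarithm, and all function names are assumptions made for illustration, and no maximization over ensembles is attempted.

```python
import math


def shannon_entropy(probs):
    """Entropy in bits of a probability vector; zero entries contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def holevo_quantity_diagonal(weights, spectra):
    """Holevo quantity of an ensemble of commuting states.

    weights : probabilities p_i of the ensemble
    spectra : for each i, the diagonal of rho_i in the common eigenbasis
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    dim = len(spectra[0])
    # Diagonal of the average state sum_i p_i rho_i.
    average = [sum(w * s[k] for w, s in zip(weights, spectra)) for k in range(dim)]
    mean_entropy = sum(w * shannon_entropy(s) for w, s in zip(weights, spectra))
    return shannon_entropy(average) - mean_entropy


if __name__ == "__main__":
    # Two perfectly distinguishable states carry one bit.
    print(holevo_quantity_diagonal([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]]))
    # Two identical states carry no information.
    print(holevo_quantity_diagonal([0.5, 0.5], [[0.3, 0.7], [0.3, 0.7]]))
```

For non-commuting states the entropies would have to be computed from the eigenvalues of the averaged density operator, which this sketch deliberately leaves out.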

VII. Appendix

A. C*-Algebras, States and $\mathbb{Z}$-Actions

C*-algebras are axiomatic generalizations of well-known
objects, such as the continuous functions over compact spaces
or bounded operators over finite- or infinite-dimensional
Hilbert spaces, which, in addition to their algebraic structure
given by the possibility to add and multiply elements, carry an
adjoint operation and a norm. An excellent introduction
to the basic concepts and methods relevant for applications
of C*-algebras is given in [18].
Let $\mathcal{A}$ be a linear space over the field $\mathbb{C}$ which is additionally
endowed with a distributive and associative product. An
adjoint operation $* : \mathcal{A} \to \mathcal{A}$ is an anti-linear map (i.e.
$(\lambda a + \mu a')^* = \bar{\lambda} a^* + \bar{\mu} a'^*$) with $a^{**} = a$ and $(aa')^* = a'^* a^*$
for $a, a' \in \mathcal{A}$, $\lambda, \mu \in \mathbb{C}$. An algebra $\mathcal{A}$ over $\mathbb{C}$ equipped with
an adjoint operation is called a *-algebra.
A *-algebra $\mathcal{A}$ is a C*-algebra if there is a norm $\|\cdot\| : \mathcal{A} \to [0, \infty)$
such that

1) $\mathcal{A}$ is complete with respect to $\|\cdot\|$.

2) $\|aa'\| \le \|a\| \cdot \|a'\|$ for all $a, a' \in \mathcal{A}$.

3) $\|a^* a\| = \|a\|^2$ for all $a \in \mathcal{A}$.

Standard examples of C*-algebras are:

1) The set $C(X)$ of continuous $\mathbb{C}$-valued functions on a
compact Hausdorff space $X$ equipped with the sup-norm
$\|\cdot\|_\infty$. The adjoint operation is given by complex
conjugation of functions.

2) The set $\mathcal{B}(\mathcal{H})$ of bounded operators acting on a Hilbert
space $\mathcal{H}$ with the operator norm. $*$ is then the usual adjoint
operation.

3) Quasi-local algebras, see the next section VII-B.

A C*-algebra $\mathcal{A}$ is called unital if there is an element $\mathbf{1} \in \mathcal{A}$,
called the identity, with $\mathbf{1}a = a$ for all $a \in \mathcal{A}$. In this paper
we will be concerned only with unital algebras.
A state on a unital C*-algebra is a $\mathbb{C}$-linear functional
$\psi : \mathcal{A} \to \mathbb{C}$ with

1) $\psi(a) \ge 0$ for all $a \in \mathcal{A}$ with $a \ge 0$. Here $a \ge 0$ means
that there is $b \in \mathcal{A}$ with $a = b^* b$.

2) $\psi(\mathbf{1}) = 1$.

For a compact Hausdorff space $X$, states on $(C(X), \|\cdot\|_\infty)$ can
be uniquely associated to probability measures on $(X, \Sigma_{Borel})$
via the Riesz-Markov representation theorem (see [30] Theorem
2.14).

If we consider a finite-dimensional Hilbert space $\mathcal{H}$ and
$\mathcal{A} = \mathcal{B}(\mathcal{H})$ then, using the Hilbert-Schmidt inner product
$\langle a, b \rangle = \mathrm{tr}(a^* b)$, $a, b \in \mathcal{A}$, and the Riesz representation theorem
from elementary linear algebra, it is easily seen that each state
$\psi$ on $\mathcal{A}$ can be represented by a unique density operator
$D \in \mathcal{A}$ (i.e. $D = D^*$, $D \ge 0$ and $\mathrm{tr}(D) = 1$), i.e.

$$\psi(a) = \mathrm{tr}(Da) \quad \forall a \in \mathcal{A}.$$
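The representation of a state by a density operator can be checked by hand in the smallest non-trivial case. The following sketch evaluates $\psi(a) = \mathrm{tr}(Da)$ for $2 \times 2$ complex matrices stored as nested lists; the particular matrices $D$ and $b$ and all helper names are hypothetical choices for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def state(density, a):
    """Evaluate the functional a -> tr(D a)."""
    return trace(matmul(density, a))


# A density operator: self-adjoint, trace 1, determinant 0.125 > 0, hence positive.
D = [[0.75, 0.25], [0.25, 0.25]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    b = [[1.0, 2j], [0.0, 1.0]]
    positive_element = matmul(adjoint(b), b)   # an element of the form b* b
    print(state(D, IDENTITY))                  # normalization: psi(1) = 1
    print(state(D, positive_element))          # positivity: psi(b* b) >= 0
```

The two printed values illustrate the two defining properties of a state, normalization and positivity, for this particular choice of $D$.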

A *-automorphism of a C*-algebra is a one-to-one, onto
linear map $T : \mathcal{A} \to \mathcal{A}$ with $T(a^*) = (T(a))^*$ and $T(aa') = T(a)T(a')$
for all $a, a' \in \mathcal{A}$. Any *-automorphism induces a
$\mathbb{Z}$-action on $\mathcal{A}$, i.e. a family $(T_z)_{z \in \mathbb{Z}}$ of *-automorphisms
with $T_{z_1 + z_2} = T_{z_1} \circ T_{z_2}$ and $T_0 = \mathrm{id}_{\mathcal{A}}$. This family is given
by $T_z := T^z$, $z \in \mathbb{Z}$. Conversely, each $\mathbb{Z}$-action is given in
this way: simply set $T := T_1$.

A state $\psi$ on $\mathcal{A}$ is $\mathbb{Z}$-invariant, or, equivalently, $T$-invariant,
if $\psi \circ T = \psi$ holds. It is obvious that then $\psi \circ T_z = \psi$ for all $z \in \mathbb{Z}$.
The set of $\mathbb{Z}$-invariant states is convex. A $\mathbb{Z}$-invariant state
$\psi$ is called ergodic if it is an extreme point in the convex
set of $\mathbb{Z}$-invariant states. A somewhat more concrete example
of a $\mathbb{Z}$-action and a necessary and sufficient criterion for
ergodicity, which parallels the classical setting, can be found in
the next section VII-B.



B. Quasi-Local Algebras and Ergodic States 

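Quasi-local algebras are used to describe interacting systems
of infinitely many spins over a lattice $\mathbb{Z}^d$ in quantum statistical
mechanics. We will consider only the case $d = 1$, but the
constructions generalize immediately to arbitrary dimension.
A readable introduction to quasi-local algebras and ergodic
states on such algebras is given in [35].
Let us consider a finite-dimensional C*-algebra $\mathcal{A}$. For
definiteness, let us consider the case that either $\mathcal{A} = \mathcal{L}(\mathcal{H})$
is the C*-algebra of linear operators acting on some
finite-dimensional Hilbert space $\mathcal{H}$, or $\mathcal{A} = \mathcal{F}(A)$, where $\mathcal{F}(A)$
denotes the set of all $\mathbb{C}$-valued functions on a finite set $A$.
These two examples will suffice for our concerns. Suppose
that to each $n \in \mathbb{Z}$ we attach a copy $\mathcal{A}_n$ of $\mathcal{A}$. Let $\Lambda \subset \mathbb{Z}$
be a finite set and let us define $\mathcal{A}^{\Lambda} := \bigotimes_{n \in \Lambda} \mathcal{A}_n$. $\mathcal{A}^{\Lambda}$ is
called the algebra of observables belonging to sites in $\Lambda$. For
$\Lambda \subset \Lambda' \subset \mathbb{Z}$, both finite, there is a natural embedding of $\mathcal{A}^{\Lambda}$
into $\mathcal{A}^{\Lambda'}$ given by

$$\mathcal{A}^{\Lambda} \ni a \mapsto a \otimes \mathbf{1}_{\Lambda' \setminus \Lambda} \in \mathcal{A}^{\Lambda'},$$

where $\mathbf{1}_{\Lambda' \setminus \Lambda}$ denotes the identity in $\mathcal{A}^{\Lambda' \setminus \Lambda}$. Note also that

$$\| a \otimes \mathbf{1}_{\Lambda' \setminus \Lambda} \|_{\mathcal{A}^{\Lambda'}} = \| a \|_{\mathcal{A}^{\Lambda}}$$

holds for all $a \in \mathcal{A}^{\Lambda}$. Moreover, if for two finite subsets $\Lambda_1, \Lambda_2$
of $\mathbb{Z}$ we have $a \in \mathcal{A}^{\Lambda_1}$ and $a' \in \mathcal{A}^{\Lambda_2}$, then the product, adjoint
operation, and other algebraic constructions can be naturally
performed in the larger algebra $\mathcal{A}^{\Lambda_1 \cup \Lambda_2}$. Thus, if we set

$$\mathcal{A}^{loc} := \bigcup_{\Lambda \subset \mathbb{Z} :\, |\Lambda| < \infty} \mathcal{A}^{\Lambda}$$

we obtain the normed *-algebra of local observables. Its
norm-completion

$$\mathcal{A}^{\mathbb{Z}} := \overline{\mathcal{A}^{loc}}^{\|\cdot\|}$$

is then called the quasi-local C*-algebra built up from $\mathcal{A}$.
For example, if $\mathcal{A} = \mathcal{F}(A)$ then it is an immediate consequence
of the Stone-Weierstrass theorem (cf. [18] theorem 3.4.14) that
$\mathcal{A}^{\mathbb{Z}} = C(A^{\mathbb{Z}})$, the set of continuous functions on $A^{\mathbb{Z}}$ with the
$\|\cdot\|_\infty$-norm, where $A^{\mathbb{Z}}$ is equipped with the product topology.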

Remark 7.1: Note the similarity of the construction of the
quasi-local C*-algebra to the construction of the $\sigma$-algebra $\Sigma_c$
on the space of doubly-infinite sequences drawn from some finite
alphabet $A$ (cf. [32]): For the latter purpose one starts for
each $n \in \mathbb{N}$ with the algebra $\Sigma_n$ of sets which is generated by
the cylinder sets with the base in $A^{[-n, n]}$ and observes that
$\Sigma_n \subset \Sigma_{n+1}$. Then it is clear that $\Sigma_{loc} := \bigcup_{n \in \mathbb{N}} \Sigma_n$ is an
algebra of sets. Then $\Sigma_c$ is simply defined by $\Sigma_c := \sigma(\Sigma_{loc})$,
i.e. as the $\sigma$-algebra generated by $\Sigma_{loc}$. The main difference
to the quasi-local setting lies in the kind of approximation. In
the quasi-local algebra each observable can be approximated
uniformly by local ones, whereas the approximation in $\Sigma_c$
means that for each probability measure $p$ on $(A^{\mathbb{Z}}, \Sigma_c)$ and
each $B \in \Sigma_c$ there is a sequence $(B_n)_{n \in \mathbb{N}}$ in $\Sigma_{loc}$ with
$\lim_{n \to \infty} p(B \,\triangle\, B_n) = 0$. Here $\triangle$ denotes the symmetric
difference of the sets.

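Any state $\psi$ on $\mathcal{A}^{\mathbb{Z}}$ induces a family of states $(\psi^{\Lambda})_{\Lambda \subset \mathbb{Z} :\, |\Lambda| < \infty}$
on $\mathcal{A}^{\Lambda}$. $\Lambda \subset \Lambda'$ implies $\psi^{\Lambda'}|_{\mathcal{A}^{\Lambda}} = \psi^{\Lambda}$, i.e. the family
$(\psi^{\Lambda})_{\Lambda \subset \mathbb{Z} :\, |\Lambda| < \infty}$ is consistent. Conversely, any consistent
family of states $(\psi^{\Lambda})_{\Lambda \subset \mathbb{Z} :\, |\Lambda| < \infty}$ defines a state $\psi$ on $\mathcal{A}^{\mathbb{Z}}$.

Remark 7.2: At this point we have again a nice analogy
to the classical case. According to Kolmogorov's consistency
theorem (cf. [32] for a concise discussion) any probability
measure $p$ on $(A^{\mathbb{Z}}, \Sigma_c)$ is uniquely determined by (or can be
constructed from) the set of its finite-dimensional marginal
distributions on $A^{[n, m]}$, $n \le m$, $n, m \in \mathbb{Z}$.

A shift $T : \mathcal{A}^{\mathbb{Z}} \to \mathcal{A}^{\mathbb{Z}}$ on $\mathcal{A}^{\mathbb{Z}}$ is induced by

$$\mathcal{A}^{[n, m]} \ni a \simeq a \otimes \mathbf{1}_{[m+1]} \mapsto T(a) := \mathbf{1}_{[n]} \otimes a \simeq a \in \mathcal{A}^{[n+1, m+1]}.$$

Note that the shift $T$ is a *-isomorphism, i.e. it is linear and
fulfills $T(a^*) = (T(a))^*$ and $T(ab) = T(a)T(b)$ for all $a, b \in \mathcal{A}^{\mathbb{Z}}$.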

A state $\psi$ on $\mathcal{A}^{\mathbb{Z}}$ is called stationary if $\psi \circ T = \psi$ holds.
The set of stationary states on $\mathcal{A}^{\mathbb{Z}}$ is convex. A state $\psi$ on
$\mathcal{A}^{\mathbb{Z}}$ is called ergodic if it is an extreme point in the set of
stationary states. It can be shown (cf. [35] theorem 1.7.10)
that, for quasi-local algebras and shifts, the statement that $\psi$
is ergodic is equivalent to

$$\lim_{n \to \infty} \frac{1}{2n+1} \sum_{i=-n}^{n} \psi(a\, T^{i}(b)) = \psi(a)\, \psi(b) \tag{80}$$

for all $a, b \in \mathcal{A}^{\mathbb{Z}}$.

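The v. Neumann entropy rate of a stationary state $\psi$ on $\mathcal{A}^{\mathbb{Z}}$ is
given by

$$s(\psi) := \lim_{n \to \infty} \frac{1}{n}\, S(\psi^{n}), \tag{81}$$

where $\psi^{n} := \psi^{[1, n]}$ with $[1, n] := \{1, 2, \ldots, n\}$, and

$$S(\psi^{n}) := -\mathrm{tr}(D_{\psi^{n}} \log D_{\psi^{n}})$$

is the v. Neumann entropy. $D_{\psi^{n}} \in \mathcal{A}^{[1, n]}$ denotes the density
operator of $\psi^{n}$. Note that the limit in eq. (81) exists and equals
$\inf_{n \in \mathbb{N}} \frac{1}{n} S(\psi^{n})$ since the v. Neumann entropy is subadditive and
$\psi$ is assumed stationary:

$$S(\psi^{n+m}) \le S(\psi^{n}) + S(\psi^{m}).$$

For a proof of the last inequality see e.g. [25].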
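For a stationary product state the limit defining the entropy rate is attained at every $n$, because the entropy of an $n$-fold tensor power is $n$ times the single-site entropy. The sketch below illustrates this, together with one instance of the subadditivity inequality, working only with eigenvalue lists; the base-2 logarithm, the example spectra, and the function names are assumptions made for illustration.

```python
import math
from itertools import product


def von_neumann_entropy(eigenvalues):
    """Entropy in bits computed from the spectrum of a density operator."""
    return -sum(x * math.log2(x) for x in eigenvalues if x > 0.0)


def tensor_power_spectrum(eigenvalues, n):
    """Spectrum of the n-fold tensor power: all products of n eigenvalues."""
    return [math.prod(combo) for combo in product(eigenvalues, repeat=n)]


def entropy_rate_estimate(eigenvalues, n):
    """(1/n) S(psi^n) for the product state built from a single-site spectrum."""
    return von_neumann_entropy(tensor_power_spectrum(eigenvalues, n)) / n


if __name__ == "__main__":
    single_site = [0.25, 0.75]
    for n in range(1, 6):
        print(n, entropy_rate_estimate(single_site, n))

    # Subadditivity for a classically correlated two-site state:
    # joint spectrum (1/2, 0, 0, 1/2), both marginals (1/2, 1/2).
    joint = von_neumann_entropy([0.5, 0.0, 0.0, 0.5])
    marginals = 2 * von_neumann_entropy([0.5, 0.5])
    print(joint, "<=", marginals)
```

For correlated stationary states the sequence $\frac{1}{n} S(\psi^n)$ is in general not constant, and only the infimum characterization guarantees convergence.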

C. Completely Positive Maps and Quantum Channels 

In this part of the appendix we provide the basic definition
of complete positivity and make an attempt to explain how this
fits with the notion of a channel from classical information
theory. The standard reference for completely positive maps
is the monograph [28] by Paulsen.

Let $\mathcal{A}, \mathcal{B}$ be C*-algebras and consider a linear map $E : \mathcal{B} \to \mathcal{A}$.
$E$ is called positive if $E(b) \ge 0$ for all $b \in \mathcal{B}$ with $b \ge 0$.
Suppose that $\mathcal{B}, \mathcal{A}$ are unital; a linear map $E : \mathcal{B} \to \mathcal{A}$ is
unital if $E(\mathbf{1}_{\mathcal{B}}) = \mathbf{1}_{\mathcal{A}}$.

Now, we consider additionally the C*-algebra of $n$-by-$n$
complex matrices $M(n, \mathbb{C})$ and $\mathcal{B} \otimes M(n, \mathbb{C})$ respectively
$\mathcal{A} \otimes M(n, \mathbb{C})$, both of which can be endowed with a canonical
structure of a C*-algebra (see [28] for details). Essentially,
these C*-structures are given by identifying the members of
$\mathcal{B} \otimes M(n, \mathbb{C})$ resp. $\mathcal{A} \otimes M(n, \mathbb{C})$ with the $n$-by-$n$ matrices
having entries from $\mathcal{B}$ resp. $\mathcal{A}$.



A linear map $E : \mathcal{B} \to \mathcal{A}$ is completely positive if for each
positive integer $n$ the map $E \otimes \mathrm{id}_{M(n, \mathbb{C})} : \mathcal{B} \otimes M(n, \mathbb{C}) \to \mathcal{A} \otimes M(n, \mathbb{C})$
is positive, where $\mathrm{id}_{M(n, \mathbb{C})}$ denotes the identity
map on $M(n, \mathbb{C})$.

A completely positive unital map $E : \mathcal{B} \to \mathcal{A}$ induces a map
$E' : \mathcal{S}(\mathcal{A}) \to \mathcal{S}(\mathcal{B})$ between the sets of states via

$$E'(\psi) := \psi \circ E \qquad (\psi \in \mathcal{S}(\mathcal{A})).$$

Quantum channels are defined as completely positive, unital
maps between C*-algebras. The connection to the classical
channels is established via the following result of Stinespring (see
[36], [28]): If $\mathcal{B}$ or $\mathcal{A}$ is commutative (abelian) then each linear,
positive map $E : \mathcal{B} \to \mathcal{A}$ is completely positive.
We will deal only with the simplest case in order to recover
the classical channels. To this end let us consider two finite
sets $Y$ and $X$ and a stochastic matrix $W : X \to Y$, i.e.
$W(y|x) \ge 0$ and $\sum_{y \in Y} W(y|x) = 1$ for $x \in X$. Define
$E : \mathcal{F}(Y) \to \mathcal{F}(X)$ by

$$E(f)(x) := \sum_{y \in Y} W(y|x)\, f(y).$$

This map is obviously linear, positive and unital. And it is
clear that each linear, positive and unital map is representable
in this way. The induced map $E' : \mathcal{P}(X) \to \mathcal{P}(Y)$ between
the sets of probability distributions is then easily calculated:

$$E'(p)(y) = \sum_{x \in X} p(x)\, W(y|x) \qquad (y \in Y),$$

which is exactly the output distribution of the channel for the
stochastic input $p \in \mathcal{P}(X)$. This shows that classical channels
fit nicely into the theory of completely positive maps.
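The two pictures of a classical channel can be compared directly on a small example. The sketch below represents a stochastic matrix as a nested list with `W[x][y]` standing for $W(y|x)$, implements the map $E$ on functions and the induced map $E'$ on distributions, and checks the duality $\sum_x p(x)\, E(f)(x) = \sum_y E'(p)(y)\, f(y)$; the binary symmetric channel and all names are hypothetical choices for illustration.

```python
def apply_to_function(W, f):
    """E(f)(x) = sum_y W(y|x) f(y); W[x][y] stores W(y|x)."""
    return [sum(row[y] * f[y] for y in range(len(f))) for row in W]


def apply_to_distribution(W, p):
    """E'(p)(y) = sum_x p(x) W(y|x), the output distribution for input p."""
    n_out = len(W[0])
    return [sum(p[x] * W[x][y] for x in range(len(p))) for y in range(n_out)]


def pairing(p, f):
    """Expectation of the function f under the distribution p."""
    return sum(pi * fi for pi, fi in zip(p, f))


# Binary symmetric channel with crossover probability 0.1 (illustrative choice).
BSC = [[0.9, 0.1],
       [0.1, 0.9]]

if __name__ == "__main__":
    p = [0.25, 0.75]
    f = [2.0, -1.0]
    print(apply_to_function(BSC, [1.0, 1.0]))   # unitality: E maps 1 to 1
    print(apply_to_distribution(BSC, p))        # output distribution
    # Duality between the two pictures:
    print(pairing(p, apply_to_function(BSC, f)),
          pairing(apply_to_distribution(BSC, p), f))
```

Positivity of $E$ is visible from the formula, since a non-negative $f$ is mapped to a convex combination of its values.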